2022-01-29 16:28:25

by Anup Patel

Subject: [PATCH v2 0/6] RISC-V IPI Improvements

This series aims to improve IPI support in Linux RISC-V in the following ways:
1) Treat IPIs as normal per-CPU interrupts instead of using custom RISC-V
specific hooks. This also aligns Linux RISC-V IPI support with other
architectures.
2) Prefer direct IPIs over SBI calls for remote TLB flushes and icache
flushes whenever specialized hardware (such as the RISC-V AIA IMSIC or
the RISC-V ACLINT) allows S-mode software to directly inject IPIs
without any assistance from the M-mode runtime firmware.

These patches were originally part of the "Linux RISC-V ACLINT Support"
series but are now a separate series so that they can be merged
independently of the "Linux RISC-V ACLINT Support" series.
(Refer: https://lore.kernel.org/lkml/[email protected]/)

These patches are also preparatory work for the upcoming:
1) Linux RISC-V ACLINT support
2) Linux RISC-V AIA support
3) KVM RISC-V TLB flush improvements

These patches can also be found in riscv_ipi_imp_v2 branch at:
https://github.com/avpatel/linux.git

Changes since v1:
- Use synthetic fwnode for INTC instead of irq_set_default_host() in PATCH2

Anup Patel (6):
RISC-V: Clear SIP bit only when using SBI IPI operations
irqchip/riscv-intc: Create domain using named fwnode
RISC-V: Treat IPIs as normal Linux IRQs
RISC-V: Allow marking IPIs as suitable for remote FENCEs
RISC-V: Use IPIs for remote TLB flush when possible
RISC-V: Use IPIs for remote icache flush when possible

arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/ipi-mux.h | 45 ++++++
arch/riscv/include/asm/irq.h | 2 +
arch/riscv/include/asm/sbi.h | 2 +
arch/riscv/include/asm/smp.h | 49 +++++--
arch/riscv/kernel/Makefile | 1 +
arch/riscv/kernel/cpu-hotplug.c | 3 +-
arch/riscv/kernel/ipi-mux.c | 223 ++++++++++++++++++++++++++++++
arch/riscv/kernel/irq.c | 16 ++-
arch/riscv/kernel/sbi.c | 18 ++-
arch/riscv/kernel/smp.c | 164 +++++++++++-----------
arch/riscv/kernel/smpboot.c | 5 +-
arch/riscv/mm/cacheflush.c | 5 +-
arch/riscv/mm/tlbflush.c | 93 +++++++++++--
drivers/clocksource/timer-clint.c | 21 ++-
drivers/clocksource/timer-riscv.c | 11 +-
drivers/irqchip/irq-riscv-intc.c | 67 ++++-----
drivers/irqchip/irq-sifive-plic.c | 19 +--
18 files changed, 563 insertions(+), 182 deletions(-)
create mode 100644 arch/riscv/include/asm/ipi-mux.h
create mode 100644 arch/riscv/kernel/ipi-mux.c

--
2.25.1


2022-01-29 16:28:28

by Anup Patel

Subject: [PATCH v2 1/6] RISC-V: Clear SIP bit only when using SBI IPI operations

The software interrupt pending (i.e. [M|S]SIP) bit is writable for
S-mode but read-only for M-mode, so we clear this bit only when using
the SBI IPI operations.

Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Bin Meng <[email protected]>
---
arch/riscv/kernel/sbi.c | 8 +++++++-
arch/riscv/kernel/smp.c | 2 --
2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
index f72527fcb347..9786fc641436 100644
--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -621,8 +621,14 @@ static void sbi_send_cpumask_ipi(const struct cpumask *target)
sbi_send_ipi(target);
}

+static void sbi_ipi_clear(void)
+{
+ csr_clear(CSR_IP, IE_SIE);
+}
+
static const struct riscv_ipi_ops sbi_ipi_ops = {
- .ipi_inject = sbi_send_cpumask_ipi
+ .ipi_inject = sbi_send_cpumask_ipi,
+ .ipi_clear = sbi_ipi_clear
};

void __init sbi_init(void)
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index b5d30ea92292..6fd8b3cbec1b 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -89,8 +89,6 @@ void riscv_clear_ipi(void)
{
if (ipi_ops && ipi_ops->ipi_clear)
ipi_ops->ipi_clear();
-
- csr_clear(CSR_IP, IE_SIE);
}
EXPORT_SYMBOL_GPL(riscv_clear_ipi);

--
2.25.1

2022-01-29 16:30:43

by Anup Patel

Subject: [PATCH v2 2/6] irqchip/riscv-intc: Create domain using named fwnode

We should create the INTC domain using a synthetic fwnode so that
drivers (such as the RISC-V SBI IPI driver, RISC-V timer driver,
RISC-V PMU driver, etc.) which do not have a dedicated DT/ACPI node
can directly create interrupt mappings for the standard local
interrupt numbers defined by the RISC-V privileged specification.

Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/include/asm/irq.h | 2 ++
arch/riscv/kernel/irq.c | 13 +++++++++++++
drivers/clocksource/timer-clint.c | 13 +++++++------
drivers/clocksource/timer-riscv.c | 11 ++---------
drivers/irqchip/irq-riscv-intc.c | 12 ++++++++++--
drivers/irqchip/irq-sifive-plic.c | 19 +++++++++++--------
6 files changed, 45 insertions(+), 25 deletions(-)

diff --git a/arch/riscv/include/asm/irq.h b/arch/riscv/include/asm/irq.h
index e4c435509983..f85ebaf07505 100644
--- a/arch/riscv/include/asm/irq.h
+++ b/arch/riscv/include/asm/irq.h
@@ -12,6 +12,8 @@

#include <asm-generic/irq.h>

+extern struct fwnode_handle *riscv_intc_fwnode(void);
+
extern void __init init_IRQ(void);

#endif /* _ASM_RISCV_IRQ_H */
diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
index 7207fa08d78f..f2fed78ab659 100644
--- a/arch/riscv/kernel/irq.c
+++ b/arch/riscv/kernel/irq.c
@@ -7,9 +7,22 @@

#include <linux/interrupt.h>
#include <linux/irqchip.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
#include <linux/seq_file.h>
#include <asm/smp.h>

+static struct fwnode_handle *intc_fwnode;
+
+struct fwnode_handle *riscv_intc_fwnode(void)
+{
+ if (!intc_fwnode)
+ intc_fwnode = irq_domain_alloc_named_fwnode("RISCV-INTC");
+
+ return intc_fwnode;
+}
+EXPORT_SYMBOL_GPL(riscv_intc_fwnode);
+
int arch_show_interrupts(struct seq_file *p, int prec)
{
show_ipi_stats(p, prec);
diff --git a/drivers/clocksource/timer-clint.c b/drivers/clocksource/timer-clint.c
index 6cfe2ab73eb0..6e5624989525 100644
--- a/drivers/clocksource/timer-clint.c
+++ b/drivers/clocksource/timer-clint.c
@@ -149,6 +149,7 @@ static int __init clint_timer_init_dt(struct device_node *np)
int rc;
u32 i, nr_irqs;
void __iomem *base;
+ struct irq_domain *domain;
struct of_phandle_args oirq;

/*
@@ -169,14 +170,14 @@ static int __init clint_timer_init_dt(struct device_node *np)
np, i, oirq.args[0]);
return -ENODEV;
}
-
- /* Find parent irq domain and map timer irq */
- if (!clint_timer_irq &&
- oirq.args[0] == RV_IRQ_TIMER &&
- irq_find_host(oirq.np))
- clint_timer_irq = irq_of_parse_and_map(np, i);
}

+ /* Find parent irq domain and map timer irq */
+ domain = irq_find_matching_fwnode(riscv_intc_fwnode(),
+ DOMAIN_BUS_ANY);
+ if (!clint_timer_irq && domain)
+ clint_timer_irq = irq_create_mapping(domain, RV_IRQ_TIMER);
+
/* If CLINT timer irq not found then fail */
if (!clint_timer_irq) {
pr_err("%pOFP: timer irq not found\n", np);
diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
index 1767f8bf2013..a98f5d18bab9 100644
--- a/drivers/clocksource/timer-riscv.c
+++ b/drivers/clocksource/timer-riscv.c
@@ -102,7 +102,6 @@ static irqreturn_t riscv_timer_interrupt(int irq, void *dev_id)
static int __init riscv_timer_init_dt(struct device_node *n)
{
int cpuid, hartid, error;
- struct device_node *child;
struct irq_domain *domain;

hartid = riscv_of_processor_hartid(n);
@@ -121,14 +120,8 @@ static int __init riscv_timer_init_dt(struct device_node *n)
if (cpuid != smp_processor_id())
return 0;

- domain = NULL;
- child = of_get_compatible_child(n, "riscv,cpu-intc");
- if (!child) {
- pr_err("Failed to find INTC node [%pOF]\n", n);
- return -ENODEV;
- }
- domain = irq_find_host(child);
- of_node_put(child);
+ domain = irq_find_matching_fwnode(riscv_intc_fwnode(),
+ DOMAIN_BUS_ANY);
if (!domain) {
pr_err("Failed to find IRQ domain for node [%pOF]\n", n);
return -ENODEV;
diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index b65bd8878d4f..26ed62c11768 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -112,8 +112,16 @@ static int __init riscv_intc_init(struct device_node *node,
if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
return 0;

- intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
- &riscv_intc_domain_ops, NULL);
+ /*
+ * Create INTC domain using a synthetic fwnode which will allow
+ * drivers (such as RISC-V SBI IPI driver, RISC-V timer driver,
+ * RISC-V PMU driver, etc) not having dedicated DT/ACPI node to
+ * directly create interrupt mapping for standard local interrupt
+ * numbers defined by the RISC-V privileged specification.
+ */
+ intc_domain = irq_domain_create_linear(riscv_intc_fwnode(),
+ BITS_PER_LONG,
+ &riscv_intc_domain_ops, NULL);
if (!intc_domain) {
pr_err("unable to add IRQ domain\n");
return -ENXIO;
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 259065d271ef..2c43ab77c014 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -284,6 +284,7 @@ static int __init plic_init(struct device_node *node,
u32 nr_irqs;
struct plic_priv *priv;
struct plic_handler *handler;
+ struct irq_domain *domain;

priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
@@ -339,14 +340,6 @@ static int __init plic_init(struct device_node *node,
continue;
}

- /* Find parent domain and register chained handler */
- if (!plic_parent_irq && irq_find_host(parent.np)) {
- plic_parent_irq = irq_of_parse_and_map(node, i);
- if (plic_parent_irq)
- irq_set_chained_handler(plic_parent_irq,
- plic_handle_irq);
- }
-
/*
* When running in M-mode we need to ignore the S-mode handler.
* Here we assume it always comes later, but that might be a
@@ -373,6 +366,16 @@ static int __init plic_init(struct device_node *node,
nr_handlers++;
}

+ /* Find parent domain and register chained handler */
+ domain = irq_find_matching_fwnode(riscv_intc_fwnode(),
+ DOMAIN_BUS_ANY);
+ if (!plic_parent_irq && domain) {
+ plic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (plic_parent_irq)
+ irq_set_chained_handler(plic_parent_irq,
+ plic_handle_irq);
+ }
+
/*
* We can have multiple PLIC instances so setup cpuhp state only
* when context handler for current/boot CPU is present.
--
2.25.1

2022-01-29 16:33:57

by Anup Patel

Subject: [PATCH v2 3/6] RISC-V: Treat IPIs as normal Linux IRQs

Currently, the RISC-V kernel provides arch-specific hooks (i.e.
struct riscv_ipi_ops) to register IPI handling methods. The stats
gathering of IPIs is also arch-specific in the RISC-V kernel.

Other architectures (such as ARM, ARM64, and MIPS) have moved away
from custom arch specific IPI handling methods. Currently, these
architectures have Linux irqchip drivers providing a range of Linux
IRQ numbers to be used as IPIs and IPI triggering is done using
generic IPI APIs. This approach allows architectures to treat IPIs
as normal Linux IRQs and IPI stats gathering is done by the generic
Linux IRQ subsystem.

We extend the RISC-V IPI handling along the same lines so that the
arch-specific IPI handling methods (struct riscv_ipi_ops) can be
removed and IPI handling is contained entirely within Linux
irqchip drivers.

Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/ipi-mux.h | 43 ++++++
arch/riscv/include/asm/sbi.h | 2 +
arch/riscv/include/asm/smp.h | 35 +++--
arch/riscv/kernel/Makefile | 1 +
arch/riscv/kernel/cpu-hotplug.c | 3 +-
arch/riscv/kernel/ipi-mux.c | 222 ++++++++++++++++++++++++++++++
arch/riscv/kernel/irq.c | 3 +-
arch/riscv/kernel/sbi.c | 13 +-
arch/riscv/kernel/smp.c | 153 ++++++++++----------
arch/riscv/kernel/smpboot.c | 5 +-
drivers/clocksource/timer-clint.c | 8 +-
drivers/irqchip/irq-riscv-intc.c | 55 ++++----
13 files changed, 405 insertions(+), 139 deletions(-)
create mode 100644 arch/riscv/include/asm/ipi-mux.h
create mode 100644 arch/riscv/kernel/ipi-mux.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 5adcbd9b5e88..167681d6d3f8 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -54,6 +54,7 @@ config RISCV
select GENERIC_GETTIMEOFDAY if HAVE_GENERIC_VDSO
select GENERIC_IDLE_POLL_SETUP
select GENERIC_IOREMAP if MMU
+ select GENERIC_IRQ_IPI
select GENERIC_IRQ_MULTI_HANDLER
select GENERIC_IRQ_SHOW
select GENERIC_IRQ_SHOW_LEVEL
diff --git a/arch/riscv/include/asm/ipi-mux.h b/arch/riscv/include/asm/ipi-mux.h
new file mode 100644
index 000000000000..988e2bba372a
--- /dev/null
+++ b/arch/riscv/include/asm/ipi-mux.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022 Ventana Micro Systems Inc.
+ */
+
+#ifndef _ASM_RISCV_IPI_MUX_H
+#define _ASM_RISCV_IPI_MUX_H
+
+struct cpumask;
+
+#ifdef CONFIG_SMP
+
+/* Handle muxed IPIs */
+void riscv_ipi_mux_handle_irq(void);
+
+/* Create irq_domain for muxed IPIs */
+struct irq_domain *riscv_ipi_mux_create(bool use_soft_irq,
+ void (*clear_ipi)(void),
+ void (*send_ipi)(const struct cpumask *mask));
+
+/* Destroy irq_domain for muxed IPIs */
+void riscv_ipi_mux_destroy(struct irq_domain *d);
+
+#else
+
+static inline void riscv_ipi_mux_handle_irq(void)
+{
+}
+
+static inline struct irq_domain *riscv_ipi_mux_create(bool use_soft_irq,
+ void (*clear_ipi)(void),
+ void (*send_ipi)(const struct cpumask *mask))
+{
+ return NULL;
+}
+
+static inline void riscv_ipi_mux_destroy(struct irq_domain *d)
+{
+}
+
+#endif
+
+#endif /* _ASM_RISCV_IPI_MUX_H */
diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index d1c37479d828..1e9aa7941960 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -116,6 +116,7 @@ struct sbiret {
};

void sbi_init(void);
+void sbi_ipi_init(void);
struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0,
unsigned long arg1, unsigned long arg2,
unsigned long arg3, unsigned long arg4,
@@ -185,6 +186,7 @@ static inline unsigned long sbi_mk_version(unsigned long major,
int sbi_err_map_linux_errno(int err);
#else /* CONFIG_RISCV_SBI */
static inline int sbi_remote_fence_i(const struct cpumask *cpu_mask) { return -1; }
+static inline void sbi_ipi_init(void) { }
static inline void sbi_init(void) {}
#endif /* CONFIG_RISCV_SBI */
#endif /* _ASM_RISCV_SBI_H */
diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 23170c933d73..178fe4ada592 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -15,11 +15,6 @@
struct seq_file;
extern unsigned long boot_cpu_hartid;

-struct riscv_ipi_ops {
- void (*ipi_inject)(const struct cpumask *target);
- void (*ipi_clear)(void);
-};
-
#ifdef CONFIG_SMP
/*
* Mapping between linux logical cpu index and hartid.
@@ -33,9 +28,6 @@ void show_ipi_stats(struct seq_file *p, int prec);
/* SMP initialization hook for setup_arch */
void __init setup_smp(void);

-/* Called from C code, this handles an IPI. */
-void handle_IPI(struct pt_regs *regs);
-
/* Hook for the generic smp_call_function_many() routine. */
void arch_send_call_function_ipi_mask(struct cpumask *mask);

@@ -44,11 +36,17 @@ void arch_send_call_function_single_ipi(int cpu);

int riscv_hartid_to_cpuid(int hartid);

-/* Set custom IPI operations */
-void riscv_set_ipi_ops(const struct riscv_ipi_ops *ops);
+/* Enable IPI for CPU hotplug */
+void riscv_ipi_enable(void);
+
+/* Disable IPI for CPU hotplug */
+void riscv_ipi_disable(void);

-/* Clear IPI for current CPU */
-void riscv_clear_ipi(void);
+/* Check if IPI interrupt numbers are available */
+bool riscv_ipi_have_virq_range(void);
+
+/* Set the IPI interrupt numbers for arch (called by irqchip drivers) */
+void riscv_ipi_set_virq_range(int virq, int nr_irqs);

/* Secondary hart entry */
asmlinkage void smp_callin(void);
@@ -82,11 +80,20 @@ static inline unsigned long cpuid_to_hartid_map(int cpu)
return boot_cpu_hartid;
}

-static inline void riscv_set_ipi_ops(const struct riscv_ipi_ops *ops)
+static inline void riscv_ipi_enable(void)
{
}

-static inline void riscv_clear_ipi(void)
+static inline void riscv_ipi_disable(void)
+{
+}
+
+static inline bool riscv_ipi_have_virq_range(void)
+{
+ return false;
+}
+
+static inline void riscv_ipi_set_virq_range(int virq, int nr)
{
}

diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 612556faa527..e3cd63a8709a 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -42,6 +42,7 @@ obj-$(CONFIG_RISCV_M_MODE) += traps_misaligned.o
obj-$(CONFIG_FPU) += fpu.o
obj-$(CONFIG_SMP) += smpboot.o
obj-$(CONFIG_SMP) += smp.o
+obj-$(CONFIG_SMP) += ipi-mux.o
obj-$(CONFIG_SMP) += cpu_ops.o

obj-$(CONFIG_RISCV_BOOT_SPINWAIT) += cpu_ops_spinwait.o
diff --git a/arch/riscv/kernel/cpu-hotplug.c b/arch/riscv/kernel/cpu-hotplug.c
index be7f05b542bb..d375bfeb08df 100644
--- a/arch/riscv/kernel/cpu-hotplug.c
+++ b/arch/riscv/kernel/cpu-hotplug.c
@@ -12,7 +12,7 @@
#include <linux/sched/hotplug.h>
#include <asm/irq.h>
#include <asm/cpu_ops.h>
-#include <asm/sbi.h>
+#include <asm/smp.h>

bool cpu_has_hotplug(unsigned int cpu)
{
@@ -41,6 +41,7 @@ int __cpu_disable(void)

remove_cpu_topology(cpu);
set_cpu_online(cpu, false);
+ riscv_ipi_disable();
irq_migrate_all_off_this_cpu();

return ret;
diff --git a/arch/riscv/kernel/ipi-mux.c b/arch/riscv/kernel/ipi-mux.c
new file mode 100644
index 000000000000..93835355dccf
--- /dev/null
+++ b/arch/riscv/kernel/ipi-mux.c
@@ -0,0 +1,222 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Multiplex several IPIs over a single HW IPI.
+ *
+ * Copyright (c) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-ipi-mux: " fmt
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/init.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqdomain.h>
+#include <linux/smp.h>
+#include <asm/ipi-mux.h>
+
+struct ipi_mux {
+ struct irq_domain *domain;
+ int parent_virq;
+ void (*clear_ipi)(void);
+ void (*send_ipi)(const struct cpumask *mask);
+};
+
+static struct ipi_mux ipi_mux_priv;
+static DEFINE_PER_CPU(unsigned long, ipi_mux_bits);
+
+static void ipi_mux_dummy(struct irq_data *d)
+{
+}
+
+static void ipi_mux_send_mask(struct irq_data *d, const struct cpumask *mask)
+{
+ int cpu;
+
+ /* Barrier before doing atomic bit update to IPI bits */
+ smp_mb__before_atomic();
+
+ for_each_cpu(cpu, mask)
+ set_bit(d->hwirq, per_cpu_ptr(&ipi_mux_bits, cpu));
+
+ /* Barrier after doing atomic bit update to IPI bits */
+ smp_mb__after_atomic();
+
+ if (ipi_mux_priv.send_ipi)
+ ipi_mux_priv.send_ipi(mask);
+}
+
+static struct irq_chip ipi_mux_chip = {
+ .name = "RISC-V IPI Mux",
+ .irq_mask = ipi_mux_dummy,
+ .irq_unmask = ipi_mux_dummy,
+ .ipi_send_mask = ipi_mux_send_mask,
+};
+
+static int ipi_mux_domain_map(struct irq_domain *d, unsigned int irq,
+ irq_hw_number_t hwirq)
+{
+ irq_set_percpu_devid(irq);
+ irq_domain_set_info(d, irq, hwirq, &ipi_mux_chip, d->host_data,
+ handle_percpu_devid_irq, NULL, NULL);
+
+ return 0;
+}
+
+static int ipi_mux_domain_alloc(struct irq_domain *d, unsigned int virq,
+ unsigned int nr_irqs, void *arg)
+{
+ int i, ret;
+ irq_hw_number_t hwirq;
+ unsigned int type = IRQ_TYPE_NONE;
+ struct irq_fwspec *fwspec = arg;
+
+ ret = irq_domain_translate_onecell(d, fwspec, &hwirq, &type);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++) {
+ ret = ipi_mux_domain_map(d, virq + i, hwirq + i);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static const struct irq_domain_ops ipi_mux_domain_ops = {
+ .translate = irq_domain_translate_onecell,
+ .alloc = ipi_mux_domain_alloc,
+ .free = irq_domain_free_irqs_top,
+};
+
+void riscv_ipi_mux_handle_irq(void)
+{
+ int err;
+ unsigned long irqs, *bits = this_cpu_ptr(&ipi_mux_bits);
+ irq_hw_number_t hwirq;
+
+ while (true) {
+ if (ipi_mux_priv.clear_ipi)
+ ipi_mux_priv.clear_ipi();
+
+ /* Order bit clearing and data access. */
+ mb();
+
+ irqs = xchg(bits, 0);
+ if (!irqs)
+ break;
+
+ for_each_set_bit(hwirq, &irqs, BITS_PER_LONG) {
+ err = generic_handle_domain_irq(ipi_mux_priv.domain,
+ hwirq);
+ if (unlikely(err))
+ pr_warn_ratelimited(
+ "can't find mapping for hwirq %lu\n",
+ hwirq);
+ }
+ }
+}
+
+static void ipi_mux_handle_irq(struct irq_desc *desc)
+{
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+
+ chained_irq_enter(chip, desc);
+ riscv_ipi_mux_handle_irq();
+ chained_irq_exit(chip, desc);
+}
+
+static int ipi_mux_dying_cpu(unsigned int cpu)
+{
+ if (ipi_mux_priv.parent_virq)
+ disable_percpu_irq(ipi_mux_priv.parent_virq);
+ return 0;
+}
+
+static int ipi_mux_starting_cpu(unsigned int cpu)
+{
+ if (ipi_mux_priv.parent_virq)
+ enable_percpu_irq(ipi_mux_priv.parent_virq,
+ irq_get_trigger_type(ipi_mux_priv.parent_virq));
+ return 0;
+}
+
+struct irq_domain *riscv_ipi_mux_create(bool use_soft_irq,
+ void (*clear_ipi)(void),
+ void (*send_ipi)(const struct cpumask *mask))
+{
+ int virq, parent_virq = 0;
+ struct irq_domain *domain;
+ struct irq_fwspec ipi;
+
+ if (ipi_mux_priv.domain || riscv_ipi_have_virq_range())
+ return NULL;
+
+ if (use_soft_irq) {
+ domain = irq_find_matching_fwnode(riscv_intc_fwnode(),
+ DOMAIN_BUS_ANY);
+ if (!domain) {
+ pr_err("unable to find INTC IRQ domain\n");
+ return NULL;
+ }
+
+ parent_virq = irq_create_mapping(domain, RV_IRQ_SOFT);
+ if (!parent_virq) {
+ pr_err("unable to create INTC IRQ mapping\n");
+ return NULL;
+ }
+ }
+
+ domain = irq_domain_add_linear(NULL, BITS_PER_LONG,
+ &ipi_mux_domain_ops, NULL);
+ if (!domain) {
+ pr_err("unable to add IPI Mux domain\n");
+ goto fail_dispose_mapping;
+ }
+
+ ipi.fwnode = domain->fwnode;
+ ipi.param_count = 1;
+ ipi.param[0] = 0;
+ virq = __irq_domain_alloc_irqs(domain, -1, BITS_PER_LONG,
+ NUMA_NO_NODE, &ipi, false, NULL);
+ if (virq <= 0) {
+ pr_err("unable to alloc IRQs from IPI Mux domain\n");
+ goto fail_domain_remove;
+ }
+
+ ipi_mux_priv.domain = domain;
+ ipi_mux_priv.parent_virq = parent_virq;
+ ipi_mux_priv.clear_ipi = clear_ipi;
+ ipi_mux_priv.send_ipi = send_ipi;
+
+ if (parent_virq)
+ irq_set_chained_handler(parent_virq, ipi_mux_handle_irq);
+
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "irqchip/riscv/ipi-mux:starting",
+ ipi_mux_starting_cpu, ipi_mux_dying_cpu);
+
+ riscv_ipi_set_virq_range(virq, BITS_PER_LONG);
+
+ return ipi_mux_priv.domain;
+
+fail_domain_remove:
+ irq_domain_remove(domain);
+fail_dispose_mapping:
+ if (parent_virq)
+ irq_dispose_mapping(parent_virq);
+ return NULL;
+}
+
+void riscv_ipi_mux_destroy(struct irq_domain *d)
+{
+ if (!d || ipi_mux_priv.domain != d)
+ return;
+
+ irq_domain_remove(ipi_mux_priv.domain);
+ if (ipi_mux_priv.parent_virq)
+ irq_dispose_mapping(ipi_mux_priv.parent_virq);
+ memset(&ipi_mux_priv, 0, sizeof(ipi_mux_priv));
+}
diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
index f2fed78ab659..f2cfeb3a9f5c 100644
--- a/arch/riscv/kernel/irq.c
+++ b/arch/riscv/kernel/irq.c
@@ -10,7 +10,7 @@
#include <linux/irqdomain.h>
#include <linux/module.h>
#include <linux/seq_file.h>
-#include <asm/smp.h>
+#include <asm/sbi.h>

static struct fwnode_handle *intc_fwnode;

@@ -34,4 +34,5 @@ void __init init_IRQ(void)
irqchip_init();
if (!handle_arch_irq)
panic("No interrupt controller found.");
+ sbi_ipi_init();
}
diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
index 9786fc641436..fa3d92fce9f8 100644
--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -5,9 +5,11 @@
* Copyright (c) 2020 Western Digital Corporation or its affiliates.
*/

+#define pr_fmt(fmt) "riscv: " fmt
#include <linux/init.h>
#include <linux/pm.h>
#include <linux/reboot.h>
+#include <asm/ipi-mux.h>
#include <asm/sbi.h>
#include <asm/smp.h>

@@ -626,10 +628,11 @@ static void sbi_ipi_clear(void)
csr_clear(CSR_IP, IE_SIE);
}

-static const struct riscv_ipi_ops sbi_ipi_ops = {
- .ipi_inject = sbi_send_cpumask_ipi,
- .ipi_clear = sbi_ipi_clear
-};
+void __init sbi_ipi_init(void)
+{
+ if (riscv_ipi_mux_create(true, sbi_ipi_clear, sbi_send_cpumask_ipi))
+ pr_info("providing IPIs using SBI IPI extension\n");
+}

void __init sbi_init(void)
{
@@ -677,6 +680,4 @@ void __init sbi_init(void)
__sbi_send_ipi = __sbi_send_ipi_v01;
__sbi_rfence = __sbi_rfence_v01;
}
-
- riscv_set_ipi_ops(&sbi_ipi_ops);
}
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 6fd8b3cbec1b..a9f1aca38358 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -17,9 +17,9 @@
#include <linux/sched.h>
#include <linux/seq_file.h>
#include <linux/delay.h>
+#include <linux/irq.h>
#include <linux/irq_work.h>

-#include <asm/sbi.h>
#include <asm/tlbflush.h>
#include <asm/cacheflush.h>

@@ -41,11 +41,9 @@ void __init smp_setup_processor_id(void)
cpuid_to_hartid_map(0) = boot_cpu_hartid;
}

-/* A collection of single bit ipi messages. */
-static struct {
- unsigned long stats[IPI_MAX] ____cacheline_aligned;
- unsigned long bits ____cacheline_aligned;
-} ipi_data[NR_CPUS] __cacheline_aligned;
+static int ipi_virq_base __ro_after_init;
+static int nr_ipi __ro_after_init = IPI_MAX;
+static struct irq_desc *ipi_desc[IPI_MAX] __read_mostly;

int riscv_hartid_to_cpuid(int hartid)
{
@@ -77,46 +75,14 @@ static void ipi_stop(void)
wait_for_interrupt();
}

-static const struct riscv_ipi_ops *ipi_ops __ro_after_init;
-
-void riscv_set_ipi_ops(const struct riscv_ipi_ops *ops)
-{
- ipi_ops = ops;
-}
-EXPORT_SYMBOL_GPL(riscv_set_ipi_ops);
-
-void riscv_clear_ipi(void)
-{
- if (ipi_ops && ipi_ops->ipi_clear)
- ipi_ops->ipi_clear();
-}
-EXPORT_SYMBOL_GPL(riscv_clear_ipi);
-
static void send_ipi_mask(const struct cpumask *mask, enum ipi_message_type op)
{
- int cpu;
-
- smp_mb__before_atomic();
- for_each_cpu(cpu, mask)
- set_bit(op, &ipi_data[cpu].bits);
- smp_mb__after_atomic();
-
- if (ipi_ops && ipi_ops->ipi_inject)
- ipi_ops->ipi_inject(mask);
- else
- pr_warn("SMP: IPI inject method not available\n");
+ __ipi_send_mask(ipi_desc[op], mask);
}

static void send_ipi_single(int cpu, enum ipi_message_type op)
{
- smp_mb__before_atomic();
- set_bit(op, &ipi_data[cpu].bits);
- smp_mb__after_atomic();
-
- if (ipi_ops && ipi_ops->ipi_inject)
- ipi_ops->ipi_inject(cpumask_of(cpu));
- else
- pr_warn("SMP: IPI inject method not available\n");
+ __ipi_send_mask(ipi_desc[op], cpumask_of(cpu));
}

#ifdef CONFIG_IRQ_WORK
@@ -126,55 +92,88 @@ void arch_irq_work_raise(void)
}
#endif

-void handle_IPI(struct pt_regs *regs)
+static irqreturn_t handle_IPI(int irq, void *data)
+{
+ int ipi = irq - ipi_virq_base;
+
+ switch (ipi) {
+ case IPI_RESCHEDULE:
+ scheduler_ipi();
+ break;
+ case IPI_CALL_FUNC:
+ generic_smp_call_function_interrupt();
+ break;
+ case IPI_CPU_STOP:
+ ipi_stop();
+ break;
+ case IPI_IRQ_WORK:
+ irq_work_run();
+ break;
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
+ case IPI_TIMER:
+ tick_receive_broadcast();
+ break;
+#endif
+ default:
+ pr_warn("CPU%d: unhandled IPI%d\n", smp_processor_id(), ipi);
+ break;
+ };
+
+ return IRQ_HANDLED;
+}
+
+void riscv_ipi_enable(void)
{
- unsigned long *pending_ipis = &ipi_data[smp_processor_id()].bits;
- unsigned long *stats = ipi_data[smp_processor_id()].stats;
+ int i;

- riscv_clear_ipi();
+ if (WARN_ON_ONCE(!ipi_virq_base))
+ return;

- while (true) {
- unsigned long ops;
+ for (i = 0; i < nr_ipi; i++)
+ enable_percpu_irq(ipi_virq_base + i, 0);
+}

- /* Order bit clearing and data access. */
- mb();
+void riscv_ipi_disable(void)
+{
+ int i;

- ops = xchg(pending_ipis, 0);
- if (ops == 0)
- return;
+ if (WARN_ON_ONCE(!ipi_virq_base))
+ return;

- if (ops & (1 << IPI_RESCHEDULE)) {
- stats[IPI_RESCHEDULE]++;
- scheduler_ipi();
- }
+ for (i = 0; i < nr_ipi; i++)
+ disable_percpu_irq(ipi_virq_base + i);
+}

- if (ops & (1 << IPI_CALL_FUNC)) {
- stats[IPI_CALL_FUNC]++;
- generic_smp_call_function_interrupt();
- }
+bool riscv_ipi_have_virq_range(void)
+{
+ return (ipi_virq_base) ? true : false;
+}

- if (ops & (1 << IPI_CPU_STOP)) {
- stats[IPI_CPU_STOP]++;
- ipi_stop();
- }
+void riscv_ipi_set_virq_range(int virq, int nr)
+{
+ int i, err;

- if (ops & (1 << IPI_IRQ_WORK)) {
- stats[IPI_IRQ_WORK]++;
- irq_work_run();
- }
+ if (WARN_ON(ipi_virq_base))
+ return;

-#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
- if (ops & (1 << IPI_TIMER)) {
- stats[IPI_TIMER]++;
- tick_receive_broadcast();
- }
-#endif
- BUG_ON((ops >> IPI_MAX) != 0);
+ WARN_ON(nr < IPI_MAX);
+ nr_ipi = min(nr, IPI_MAX);
+ ipi_virq_base = virq;
+
+ /* Request IPIs */
+ for (i = 0; i < nr_ipi; i++) {
+ err = request_percpu_irq(ipi_virq_base + i, handle_IPI,
+ "IPI", &ipi_virq_base);
+ WARN_ON(err);

- /* Order data access and bit testing. */
- mb();
+ ipi_desc[i] = irq_to_desc(ipi_virq_base + i);
+ irq_set_status_flags(ipi_virq_base + i, IRQ_HIDDEN);
}
+
+ /* Enable IPIs for boot CPU immediately */
+ riscv_ipi_enable();
}
+EXPORT_SYMBOL_GPL(riscv_ipi_set_virq_range);

static const char * const ipi_names[] = {
[IPI_RESCHEDULE] = "Rescheduling interrupts",
@@ -192,7 +191,7 @@ void show_ipi_stats(struct seq_file *p, int prec)
seq_printf(p, "%*s%u:%s", prec - 1, "IPI", i,
prec >= 4 ? " " : "");
for_each_online_cpu(cpu)
- seq_printf(p, "%10lu ", ipi_data[cpu].stats[i]);
+ seq_printf(p, "%10u ", irq_desc_kstat_cpu(ipi_desc[i], cpu));
seq_printf(p, " %s\n", ipi_names[i]);
}
}
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 622f226454d5..e37036e779bb 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -30,7 +30,6 @@
#include <asm/numa.h>
#include <asm/tlbflush.h>
#include <asm/sections.h>
-#include <asm/sbi.h>
#include <asm/smp.h>
#include <asm/alternative.h>

@@ -159,12 +158,12 @@ asmlinkage __visible void smp_callin(void)
struct mm_struct *mm = &init_mm;
unsigned int curr_cpuid = smp_processor_id();

- riscv_clear_ipi();
-
/* All kernel threads share the same mm context. */
mmgrab(mm);
current->active_mm = mm;

+ riscv_ipi_enable();
+
notify_cpu_starting(curr_cpuid);
numa_add_cpu(curr_cpuid);
update_siblings_masks(curr_cpuid);
diff --git a/drivers/clocksource/timer-clint.c b/drivers/clocksource/timer-clint.c
index 6e5624989525..d20c093c5564 100644
--- a/drivers/clocksource/timer-clint.c
+++ b/drivers/clocksource/timer-clint.c
@@ -20,6 +20,7 @@
#include <linux/of_irq.h>
#include <linux/smp.h>
#include <linux/timex.h>
+#include <asm/ipi-mux.h>

#ifndef CONFIG_RISCV_M_MODE
#include <asm/clint.h>
@@ -54,11 +55,6 @@ static void clint_clear_ipi(void)
writel(0, clint_ipi_base + cpuid_to_hartid_map(smp_processor_id()));
}

-static struct riscv_ipi_ops clint_ipi_ops = {
- .ipi_inject = clint_send_ipi,
- .ipi_clear = clint_clear_ipi,
-};
-
#ifdef CONFIG_64BIT
#define clint_get_cycles() readq_relaxed(clint_timer_val)
#else
@@ -229,7 +225,7 @@ static int __init clint_timer_init_dt(struct device_node *np)
goto fail_free_irq;
}

- riscv_set_ipi_ops(&clint_ipi_ops);
+ riscv_ipi_mux_create(true, clint_clear_ipi, clint_send_ipi);
clint_clear_ipi();

return 0;
diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index 26ed62c11768..16a56d8fd36e 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -26,20 +26,7 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
if (unlikely(cause >= BITS_PER_LONG))
panic("unexpected interrupt cause");

- switch (cause) {
-#ifdef CONFIG_SMP
- case RV_IRQ_SOFT:
- /*
- * We only use software interrupts to pass IPIs, so if a
- * non-SMP system gets one, then we don't know what to do.
- */
- handle_IPI(regs);
- break;
-#endif
- default:
- generic_handle_domain_irq(intc_domain, cause);
- break;
- }
+ generic_handle_domain_irq(intc_domain, cause);
}

/*
@@ -59,18 +46,6 @@ static void riscv_intc_irq_unmask(struct irq_data *d)
csr_set(CSR_IE, BIT(d->hwirq));
}

-static int riscv_intc_cpu_starting(unsigned int cpu)
-{
- csr_set(CSR_IE, BIT(RV_IRQ_SOFT));
- return 0;
-}
-
-static int riscv_intc_cpu_dying(unsigned int cpu)
-{
- csr_clear(CSR_IE, BIT(RV_IRQ_SOFT));
- return 0;
-}
-
static struct irq_chip riscv_intc_chip = {
.name = "RISC-V INTC",
.irq_mask = riscv_intc_irq_mask,
@@ -87,9 +62,32 @@ static int riscv_intc_domain_map(struct irq_domain *d, unsigned int irq,
return 0;
}

+static int riscv_intc_domain_alloc(struct irq_domain *domain,
+ unsigned int virq, unsigned int nr_irqs,
+ void *arg)
+{
+ int i, ret;
+ irq_hw_number_t hwirq;
+ unsigned int type = IRQ_TYPE_NONE;
+ struct irq_fwspec *fwspec = arg;
+
+ ret = irq_domain_translate_onecell(domain, fwspec, &hwirq, &type);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++) {
+ ret = riscv_intc_domain_map(domain, virq + i, hwirq + i);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
static const struct irq_domain_ops riscv_intc_domain_ops = {
.map = riscv_intc_domain_map,
.xlate = irq_domain_xlate_onecell,
+ .alloc = riscv_intc_domain_alloc
};

static int __init riscv_intc_init(struct device_node *node,
@@ -133,11 +131,6 @@ static int __init riscv_intc_init(struct device_node *node,
return rc;
}

- cpuhp_setup_state(CPUHP_AP_IRQ_RISCV_STARTING,
- "irqchip/riscv/intc:starting",
- riscv_intc_cpu_starting,
- riscv_intc_cpu_dying);
-
pr_info("%d local interrupts mapped\n", BITS_PER_LONG);

return 0;
--
2.25.1

2022-01-29 16:33:58

by Anup Patel

[permalink] [raw]
Subject: [PATCH v2 4/6] RISC-V: Allow marking IPIs as suitable for remote FENCEs

To do remote FENCEs (i.e. remote TLB flushes) using IPI calls in
the RISC-V kernel, we need a hardware mechanism to directly inject
IPIs from the RISC-V kernel instead of using SBI calls.

The upcoming ACLINT [M|S]SWI devices and AIA IMSIC devices allow
direct IPI injection from the RISC-V kernel. To support this, we
extend the riscv_ipi_set_virq_range() function so that irqchip
drivers can mark IPIs as suitable for remote FENCEs.

Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/include/asm/ipi-mux.h | 2 ++
arch/riscv/include/asm/smp.h | 18 ++++++++++++++++--
arch/riscv/kernel/ipi-mux.c | 3 ++-
arch/riscv/kernel/sbi.c | 3 ++-
arch/riscv/kernel/smp.c | 11 ++++++++++-
drivers/clocksource/timer-clint.c | 2 +-
6 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/arch/riscv/include/asm/ipi-mux.h b/arch/riscv/include/asm/ipi-mux.h
index 988e2bba372a..3a5acbf51806 100644
--- a/arch/riscv/include/asm/ipi-mux.h
+++ b/arch/riscv/include/asm/ipi-mux.h
@@ -15,6 +15,7 @@ void riscv_ipi_mux_handle_irq(void);

/* Create irq_domain for muxed IPIs */
struct irq_domain *riscv_ipi_mux_create(bool use_soft_irq,
+ bool use_for_rfence,
void (*clear_ipi)(void),
void (*send_ipi)(const struct cpumask *mask));

@@ -28,6 +29,7 @@ static inline void riscv_ipi_mux_handle_irq(void)
}

static inline struct irq_domain *riscv_ipi_mux_create(bool use_soft_irq,
+ bool use_for_rfence,
void (*clear_ipi)(void),
void (*send_ipi)(const struct cpumask *mask))
{
diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 178fe4ada592..ddd3be1c77b6 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -16,6 +16,9 @@ struct seq_file;
extern unsigned long boot_cpu_hartid;

#ifdef CONFIG_SMP
+
+#include <linux/jump_label.h>
+
/*
* Mapping between linux logical cpu index and hartid.
*/
@@ -46,7 +49,12 @@ void riscv_ipi_disable(void);
bool riscv_ipi_have_virq_range(void);

/* Set the IPI interrupt numbers for arch (called by irqchip drivers) */
-void riscv_ipi_set_virq_range(int virq, int nr_irqs);
+void riscv_ipi_set_virq_range(int virq, int nr_irqs, bool use_for_rfence);
+
+/* Check if we can use IPIs for remote FENCEs */
+DECLARE_STATIC_KEY_FALSE(riscv_ipi_for_rfence);
+#define riscv_use_ipi_for_rfence() \
+ static_branch_unlikely(&riscv_ipi_for_rfence)

/* Secondary hart entry */
asmlinkage void smp_callin(void);
@@ -93,10 +101,16 @@ static inline bool riscv_ipi_have_virq_range(void)
return false;
}

-static inline void riscv_ipi_set_virq_range(int virq, int nr)
+static inline void riscv_ipi_set_virq_range(int virq, int nr,
+ bool use_for_rfence)
{
}

+static inline bool riscv_use_ipi_for_rfence(void)
+{
+ return false;
+}
+
#endif /* CONFIG_SMP */

#if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)
diff --git a/arch/riscv/kernel/ipi-mux.c b/arch/riscv/kernel/ipi-mux.c
index 93835355dccf..501b20aed179 100644
--- a/arch/riscv/kernel/ipi-mux.c
+++ b/arch/riscv/kernel/ipi-mux.c
@@ -144,6 +144,7 @@ static int ipi_mux_starting_cpu(unsigned int cpu)
}

struct irq_domain *riscv_ipi_mux_create(bool use_soft_irq,
+ bool use_for_rfence,
void (*clear_ipi)(void),
void (*send_ipi)(const struct cpumask *mask))
{
@@ -198,7 +199,7 @@ struct irq_domain *riscv_ipi_mux_create(bool use_soft_irq,
"irqchip/riscv/ipi-mux:starting",
ipi_mux_starting_cpu, ipi_mux_dying_cpu);

- riscv_ipi_set_virq_range(virq, BITS_PER_LONG);
+ riscv_ipi_set_virq_range(virq, BITS_PER_LONG, use_for_rfence);

return ipi_mux_priv.domain;

diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
index fa3d92fce9f8..210d23524771 100644
--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -630,7 +630,8 @@ static void sbi_ipi_clear(void)

void __init sbi_ipi_init(void)
{
- if (riscv_ipi_mux_create(true, sbi_ipi_clear, sbi_send_cpumask_ipi))
+ if (riscv_ipi_mux_create(true, false,
+ sbi_ipi_clear, sbi_send_cpumask_ipi))
pr_info("providing IPIs using SBI IPI extension\n");
}

diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index a9f1aca38358..b98d9c319f6f 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -149,7 +149,10 @@ bool riscv_ipi_have_virq_range(void)
return (ipi_virq_base) ? true : false;
}

-void riscv_ipi_set_virq_range(int virq, int nr)
+DEFINE_STATIC_KEY_FALSE(riscv_ipi_for_rfence);
+EXPORT_SYMBOL_GPL(riscv_ipi_for_rfence);
+
+void riscv_ipi_set_virq_range(int virq, int nr, bool use_for_rfence)
{
int i, err;

@@ -172,6 +175,12 @@ void riscv_ipi_set_virq_range(int virq, int nr)

/* Enabled IPIs for boot CPU immediately */
riscv_ipi_enable();
+
+ /* Update RFENCE static key */
+ if (use_for_rfence)
+ static_branch_enable(&riscv_ipi_for_rfence);
+ else
+ static_branch_disable(&riscv_ipi_for_rfence);
}
EXPORT_SYMBOL_GPL(riscv_ipi_set_virq_range);

diff --git a/drivers/clocksource/timer-clint.c b/drivers/clocksource/timer-clint.c
index d20c093c5564..bc9be091a732 100644
--- a/drivers/clocksource/timer-clint.c
+++ b/drivers/clocksource/timer-clint.c
@@ -225,7 +225,7 @@ static int __init clint_timer_init_dt(struct device_node *np)
goto fail_free_irq;
}

- riscv_ipi_mux_create(true, clint_clear_ipi, clint_send_ipi);
+ riscv_ipi_mux_create(true, true, clint_clear_ipi, clint_send_ipi);
clint_clear_ipi();

return 0;
--
2.25.1

2022-01-29 16:34:04

by Anup Patel

[permalink] [raw]
Subject: [PATCH v2 5/6] RISC-V: Use IPIs for remote TLB flush when possible

If IPIs are injected using SBI IPI calls, then remote TLB flush
using SBI RFENCE calls is much faster because using IPIs for remote
TLB flush would still end up as SBI IPI calls with extra processing
on the kernel side.

It is now possible to have specialized hardware (such as RISC-V AIA
and RISC-V ACLINT) which allows S-mode software to directly inject
IPIs without any assistance from M-mode runtime firmware.

This patch extends remote TLB flush functions to use IPIs whenever
underlying IPI operations are suitable for remote FENCEs.

Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/mm/tlbflush.c | 93 +++++++++++++++++++++++++++++++++-------
1 file changed, 78 insertions(+), 15 deletions(-)

diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 37ed760d007c..27a7db8eb2c4 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -23,14 +23,62 @@ static inline void local_flush_tlb_page_asid(unsigned long addr,
: "memory");
}

+static inline void local_flush_tlb_range(unsigned long start,
+ unsigned long size, unsigned long stride)
+{
+ if (size <= stride)
+ local_flush_tlb_page(start);
+ else
+ local_flush_tlb_all();
+}
+
+static inline void local_flush_tlb_range_asid(unsigned long start,
+ unsigned long size, unsigned long stride, unsigned long asid)
+{
+ if (size <= stride)
+ local_flush_tlb_page_asid(start, asid);
+ else
+ local_flush_tlb_all_asid(asid);
+}
+
+static void __ipi_flush_tlb_all(void *info)
+{
+ local_flush_tlb_all();
+}
+
void flush_tlb_all(void)
{
- sbi_remote_sfence_vma(NULL, 0, -1);
+ if (riscv_use_ipi_for_rfence())
+ on_each_cpu(__ipi_flush_tlb_all, NULL, 1);
+ else
+ sbi_remote_sfence_vma(NULL, 0, -1);
+}
+
+struct flush_tlb_range_data {
+ unsigned long asid;
+ unsigned long start;
+ unsigned long size;
+ unsigned long stride;
+};
+
+static void __ipi_flush_tlb_range_asid(void *info)
+{
+ struct flush_tlb_range_data *d = info;
+
+ local_flush_tlb_range_asid(d->start, d->size, d->stride, d->asid);
+}
+
+static void __ipi_flush_tlb_range(void *info)
+{
+ struct flush_tlb_range_data *d = info;
+
+ local_flush_tlb_range(d->start, d->size, d->stride);
}

-static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
- unsigned long size, unsigned long stride)
+static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
+ unsigned long size, unsigned long stride)
{
+ struct flush_tlb_range_data ftd;
struct cpumask *cmask = mm_cpumask(mm);
unsigned int cpuid;
bool broadcast;
@@ -45,19 +93,34 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
unsigned long asid = atomic_long_read(&mm->context.id);

if (broadcast) {
- sbi_remote_sfence_vma_asid(cmask, start, size, asid);
- } else if (size <= stride) {
- local_flush_tlb_page_asid(start, asid);
+ if (riscv_use_ipi_for_rfence()) {
+ ftd.asid = asid;
+ ftd.start = start;
+ ftd.size = size;
+ ftd.stride = stride;
+ on_each_cpu_mask(cmask,
+ __ipi_flush_tlb_range_asid,
+ &ftd, 1);
+ } else
+ sbi_remote_sfence_vma_asid(cmask,
+ start, size, asid);
} else {
- local_flush_tlb_all_asid(asid);
+ local_flush_tlb_range_asid(start, size, stride, asid);
}
} else {
if (broadcast) {
- sbi_remote_sfence_vma(cmask, start, size);
- } else if (size <= stride) {
- local_flush_tlb_page(start);
+ if (riscv_use_ipi_for_rfence()) {
+ ftd.asid = 0;
+ ftd.start = start;
+ ftd.size = size;
+ ftd.stride = stride;
+ on_each_cpu_mask(cmask,
+ __ipi_flush_tlb_range,
+ &ftd, 1);
+ } else
+ sbi_remote_sfence_vma(cmask, start, size);
} else {
- local_flush_tlb_all();
+ local_flush_tlb_range(start, size, stride);
}
}

@@ -66,23 +129,23 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,

void flush_tlb_mm(struct mm_struct *mm)
{
- __sbi_tlb_flush_range(mm, 0, -1, PAGE_SIZE);
+ __flush_tlb_range(mm, 0, -1, PAGE_SIZE);
}

void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
{
- __sbi_tlb_flush_range(vma->vm_mm, addr, PAGE_SIZE, PAGE_SIZE);
+ __flush_tlb_range(vma->vm_mm, addr, PAGE_SIZE, PAGE_SIZE);
}

void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end)
{
- __sbi_tlb_flush_range(vma->vm_mm, start, end - start, PAGE_SIZE);
+ __flush_tlb_range(vma->vm_mm, start, end - start, PAGE_SIZE);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end)
{
- __sbi_tlb_flush_range(vma->vm_mm, start, end - start, PMD_SIZE);
+ __flush_tlb_range(vma->vm_mm, start, end - start, PMD_SIZE);
}
#endif
--
2.25.1

2022-01-29 16:34:12

by Anup Patel

[permalink] [raw]
Subject: [PATCH v2 6/6] RISC-V: Use IPIs for remote icache flush when possible

If IPIs are injected using SBI IPI calls, then remote icache flush
using SBI RFENCE calls is much faster because using IPIs for remote
icache flush would still end up as SBI IPI calls with extra processing
on the kernel side.

It is now possible to have specialized hardware (such as RISC-V AIA
and RISC-V ACLINT) which allows S-mode software to directly inject
IPIs without any assistance from M-mode runtime firmware.

This patch extends remote icache flush functions to use IPIs whenever
underlying IPI operations are suitable for remote FENCEs.

Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/mm/cacheflush.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 6cb7d96ad9c7..7c7e44aaf791 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -18,7 +18,7 @@ void flush_icache_all(void)
{
local_flush_icache_all();

- if (IS_ENABLED(CONFIG_RISCV_SBI))
+ if (IS_ENABLED(CONFIG_RISCV_SBI) && !riscv_use_ipi_for_rfence())
sbi_remote_fence_i(NULL);
else
on_each_cpu(ipi_remote_fence_i, NULL, 1);
@@ -66,7 +66,8 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
* with flush_icache_deferred().
*/
smp_mb();
- } else if (IS_ENABLED(CONFIG_RISCV_SBI)) {
+ } else if (IS_ENABLED(CONFIG_RISCV_SBI) &&
+ !riscv_use_ipi_for_rfence()) {
sbi_remote_fence_i(&others);
} else {
on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
--
2.25.1

2022-02-17 16:45:25

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] irqchip/riscv-intc: Create domain using named fwnode

On 2022-01-28 05:25, Anup Patel wrote:
> We should create INTC domain using a synthetic fwnode which will allow
> drivers (such as RISC-V SBI IPI driver, RISC-V timer driver, RISC-V
> PMU driver, etc) not having dedicated DT/ACPI node to directly create
> interrupt mapping for standard local interrupt numbers defined by the
> RISC-V privileged specification.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> arch/riscv/include/asm/irq.h | 2 ++
> arch/riscv/kernel/irq.c | 13 +++++++++++++
> drivers/clocksource/timer-clint.c | 13 +++++++------
> drivers/clocksource/timer-riscv.c | 11 ++---------
> drivers/irqchip/irq-riscv-intc.c | 12 ++++++++++--
> drivers/irqchip/irq-sifive-plic.c | 19 +++++++++++--------
> 6 files changed, 45 insertions(+), 25 deletions(-)
>
> diff --git a/arch/riscv/include/asm/irq.h
> b/arch/riscv/include/asm/irq.h
> index e4c435509983..f85ebaf07505 100644
> --- a/arch/riscv/include/asm/irq.h
> +++ b/arch/riscv/include/asm/irq.h
> @@ -12,6 +12,8 @@
>
> #include <asm-generic/irq.h>
>
> +extern struct fwnode_handle *riscv_intc_fwnode(void);
> +
> extern void __init init_IRQ(void);
>
> #endif /* _ASM_RISCV_IRQ_H */
> diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
> index 7207fa08d78f..f2fed78ab659 100644
> --- a/arch/riscv/kernel/irq.c
> +++ b/arch/riscv/kernel/irq.c
> @@ -7,9 +7,22 @@
>
> #include <linux/interrupt.h>
> #include <linux/irqchip.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> #include <linux/seq_file.h>
> #include <asm/smp.h>
>
> +static struct fwnode_handle *intc_fwnode;
> +
> +struct fwnode_handle *riscv_intc_fwnode(void)
> +{
> + if (!intc_fwnode)
> + intc_fwnode = irq_domain_alloc_named_fwnode("RISCV-INTC");
> +
> + return intc_fwnode;
> +}
> +EXPORT_SYMBOL_GPL(riscv_intc_fwnode);

Why is this created outside of the root interrupt controller driver?
Furthermore, why do you need to create a new fwnode in the first place?
As far as I can tell, the INTC does have a node, and what you don't
have is the firmware linkage between the PMU (and others) and the INTC.

what you should have instead is something like:

static struct fwnode_handle *(*__get_root_intc_node)(void);
struct fwnode_handle *riscv_get_root_intc_hwnode(void)
{
if (__get_root_intc_node)
return __get_root_intc_node();

return NULL;
}

and the corresponding registration interface.

But either way, something breaks: the INTC has one node per CPU, and
expect one irqdomain per CPU. Having a single fwnode completely breaks
the INTC driver (and probably the irqdomain list, as we don't check for
duplicate entries).

> diff --git a/drivers/irqchip/irq-riscv-intc.c
> b/drivers/irqchip/irq-riscv-intc.c
> index b65bd8878d4f..26ed62c11768 100644
> --- a/drivers/irqchip/irq-riscv-intc.c
> +++ b/drivers/irqchip/irq-riscv-intc.c
> @@ -112,8 +112,16 @@ static int __init riscv_intc_init(struct
> device_node *node,
> if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
> return 0;
>
> - intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
> - &riscv_intc_domain_ops, NULL);
> + /*
> + * Create INTC domain using a synthetic fwnode which will allow
> + * drivers (such as RISC-V SBI IPI driver, RISC-V timer driver,
> + * RISC-V PMU driver, etc) not having dedicated DT/ACPI node to
> + * directly create interrupt mapping for standard local interrupt
> + * numbers defined by the RISC-V privileged specification.
> + */
> + intc_domain = irq_domain_create_linear(riscv_intc_fwnode(),
> + BITS_PER_LONG,
> + &riscv_intc_domain_ops, NULL);

This is what I'm talking about. It is simply broken. So either you don't
need a per-CPU node (and the DT was bad the first place), or you
absolutely need
one (and the whole 'well-known/default domain' doesn't work at all).

Either way, this patch is plain wrong.


M.
--
Jazz is not dead. It just smells funny...

2022-02-19 18:45:19

by Jessica Clarke

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] irqchip/riscv-intc: Create domain using named fwnode

On 19 Feb 2022, at 09:32, Marc Zyngier <[email protected]> wrote:
>
> On 2022-02-19 03:38, Anup Patel wrote:
>> On Thu, Feb 17, 2022 at 8:42 PM Marc Zyngier <[email protected]> wrote:
>>> On 2022-01-28 05:25, Anup Patel wrote:
>>> > We should create INTC domain using a synthetic fwnode which will allow
>>> > drivers (such as RISC-V SBI IPI driver, RISC-V timer driver, RISC-V
>>> > PMU driver, etc) not having dedicated DT/ACPI node to directly create
>>> > interrupt mapping for standard local interrupt numbers defined by the
>>> > RISC-V privileged specification.
>>> >
>>> > Signed-off-by: Anup Patel <[email protected]>
>>> > ---
>>> > arch/riscv/include/asm/irq.h | 2 ++
>>> > arch/riscv/kernel/irq.c | 13 +++++++++++++
>>> > drivers/clocksource/timer-clint.c | 13 +++++++------
>>> > drivers/clocksource/timer-riscv.c | 11 ++---------
>>> > drivers/irqchip/irq-riscv-intc.c | 12 ++++++++++--
>>> > drivers/irqchip/irq-sifive-plic.c | 19 +++++++++++--------
>>> > 6 files changed, 45 insertions(+), 25 deletions(-)
>>> >
>>> > diff --git a/arch/riscv/include/asm/irq.h
>>> > b/arch/riscv/include/asm/irq.h
>>> > index e4c435509983..f85ebaf07505 100644
>>> > --- a/arch/riscv/include/asm/irq.h
>>> > +++ b/arch/riscv/include/asm/irq.h
>>> > @@ -12,6 +12,8 @@
>>> >
>>> > #include <asm-generic/irq.h>
>>> >
>>> > +extern struct fwnode_handle *riscv_intc_fwnode(void);
>>> > +
>>> > extern void __init init_IRQ(void);
>>> >
>>> > #endif /* _ASM_RISCV_IRQ_H */
>>> > diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
>>> > index 7207fa08d78f..f2fed78ab659 100644
>>> > --- a/arch/riscv/kernel/irq.c
>>> > +++ b/arch/riscv/kernel/irq.c
>>> > @@ -7,9 +7,22 @@
>>> >
>>> > #include <linux/interrupt.h>
>>> > #include <linux/irqchip.h>
>>> > +#include <linux/irqdomain.h>
>>> > +#include <linux/module.h>
>>> > #include <linux/seq_file.h>
>>> > #include <asm/smp.h>
>>> >
>>> > +static struct fwnode_handle *intc_fwnode;
>>> > +
>>> > +struct fwnode_handle *riscv_intc_fwnode(void)
>>> > +{
>>> > + if (!intc_fwnode)
>>> > + intc_fwnode = irq_domain_alloc_named_fwnode("RISCV-INTC");
>>> > +
>>> > + return intc_fwnode;
>>> > +}
>>> > +EXPORT_SYMBOL_GPL(riscv_intc_fwnode);
>>> Why is this created outside of the root interrupt controller driver?
>>> Furthermore, why do you need to create a new fwnode the first place?
>>> As far as I can tell, the INTC does have a node, and what you don't
>>> have is the firmware linkage between PMU (an others) and the INTC.
>> Fair enough, I will update this patch to not create a synthetic fwnode.
>> The issue is not with INTC driver. We have other drivers and places
>> (such as SBI IPI driver, SBI PMU driver, and KVM RISC-V AIA support)
>> where we don't have a way to locate INTC fwnode.
>
> And that's exactly what I am talking about: The INTC is OK (sort of),
> but the firmware is too crap for words, and isn't even able to expose
> where the various endpoints route their interrupts to.
>
> Yes, this is probably fine today because you can describe the topology
> of RISC-V systems on the surface of a post stamp. Once you get to the
> complexity of a server-grade SoC (or worse, a mobile phone style SoC),
> this *implicit topology* stuff doesn't fly, because there is no guarantee
> that all endpoints will always all point to the same controller.
>
>>> what you should have instead is something like:
>>> static struct fwnode_handle *(*__get_root_intc_node)(void);
>>> struct fwnode_handle *riscv_get_root_intc_hwnode(void)
>>> {
>>> if (__get_root_intc_node)
>>> return __get_root_intc_node();
>>> return NULL;
>>> }
>>> and the corresponding registration interface.
>> Thanks, I will follow this suggestion. This is a much better approach
>> and it will avoid touching existing drivers.
>>> But either way, something breaks: the INTC has one node per CPU, and
>>> expect one irqdomain per CPU. Having a single fwnode completely breaks
>>> the INTC driver (and probably the irqdomain list, as we don't check for
>>> duplicate entries).
>>> > diff --git a/drivers/irqchip/irq-riscv-intc.c
>>> > b/drivers/irqchip/irq-riscv-intc.c
>>> > index b65bd8878d4f..26ed62c11768 100644
>>> > --- a/drivers/irqchip/irq-riscv-intc.c
>>> > +++ b/drivers/irqchip/irq-riscv-intc.c
>>> > @@ -112,8 +112,16 @@ static int __init riscv_intc_init(struct
>>> > device_node *node,
>>> > if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
>>> > return 0;
>>> >
>>> > - intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
>>> > - &riscv_intc_domain_ops, NULL);
>>> > + /*
>>> > + * Create INTC domain using a synthetic fwnode which will allow
>>> > + * drivers (such as RISC-V SBI IPI driver, RISC-V timer driver,
>>> > + * RISC-V PMU driver, etc) not having dedicated DT/ACPI node to
>>> > + * directly create interrupt mapping for standard local interrupt
>>> > + * numbers defined by the RISC-V privileged specification.
>>> > + */
>>> > + intc_domain = irq_domain_create_linear(riscv_intc_fwnode(),
>>> > + BITS_PER_LONG,
>>> > + &riscv_intc_domain_ops, NULL);
>>> This is what I'm talking about. It is simply broken. So either you don't
>>> need a per-CPU node (and the DT was bad the first place), or you
>>> absolutely need
>>> one (and the whole 'well-known/default domain' doesn't work at all).
>>> Either way, this patch is plain wrong.
>> Okay, I will update this patch with the new approach which you suggested.
>
> But how do you plan to work around the fact that everything is currently
> build around having a node (and an irqdomain) per CPU? The PLIC, for example,
> clearly has one parent per CPU, not one global parent.
>
> I'm sure there was a good reason for this, and I suspect merging the domains
> will simply end up breaking things.

On the contrary, the drivers rely on the controller being the same
across all harts, with riscv_intc_init skipping initialisation for all
but the boot hart’s controller. The bindings are a complete pain to
deal with as a result, what you *want* is like you have in the Arm
world where there is just one interrupt controller in the device tree
with some of the interrupts per-processor, but instead we have this
overengineered nuisance. The only reason there are per-hart interrupt
controllers is because that’s how the contexts for the CLINT/PLIC are
specified, but that really should have been done another way rather
than abusing the interrupts-extended property for that. In the FreeBSD
world we’ve been totally ignoring the device tree nodes for the local
interrupt controllers but for my AIA and ACLINT branch I started a few
months ago (though ACLINT's now been completely screwed up by RVI
politics, things have been renamed and split up differently in the past
few days and software interrupts de-prioritised with no current path to
ratification, so that was a waste of my time) I just hang the driver
off the boot hart’s node and leave all the others as totally ignored
and a waste of space other than to figure out the contexts for the PLIC
etc.

TL;DR yes the bindings are awful, no there’s no issue with merging the
domains.

Jess

2022-02-19 18:50:28

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] irqchip/riscv-intc: Create domain using named fwnode

On Sat, Feb 19, 2022 at 3:02 PM Marc Zyngier <[email protected]> wrote:
>
> On 2022-02-19 03:38, Anup Patel wrote:
> > On Thu, Feb 17, 2022 at 8:42 PM Marc Zyngier <[email protected]> wrote:
> >>
> >> On 2022-01-28 05:25, Anup Patel wrote:
> >> > We should create INTC domain using a synthetic fwnode which will allow
> >> > drivers (such as RISC-V SBI IPI driver, RISC-V timer driver, RISC-V
> >> > PMU driver, etc) not having dedicated DT/ACPI node to directly create
> >> > interrupt mapping for standard local interrupt numbers defined by the
> >> > RISC-V privileged specification.
> >> >
> >> > Signed-off-by: Anup Patel <[email protected]>
> >> > ---
> >> > arch/riscv/include/asm/irq.h | 2 ++
> >> > arch/riscv/kernel/irq.c | 13 +++++++++++++
> >> > drivers/clocksource/timer-clint.c | 13 +++++++------
> >> > drivers/clocksource/timer-riscv.c | 11 ++---------
> >> > drivers/irqchip/irq-riscv-intc.c | 12 ++++++++++--
> >> > drivers/irqchip/irq-sifive-plic.c | 19 +++++++++++--------
> >> > 6 files changed, 45 insertions(+), 25 deletions(-)
> >> >
> >> > diff --git a/arch/riscv/include/asm/irq.h
> >> > b/arch/riscv/include/asm/irq.h
> >> > index e4c435509983..f85ebaf07505 100644
> >> > --- a/arch/riscv/include/asm/irq.h
> >> > +++ b/arch/riscv/include/asm/irq.h
> >> > @@ -12,6 +12,8 @@
> >> >
> >> > #include <asm-generic/irq.h>
> >> >
> >> > +extern struct fwnode_handle *riscv_intc_fwnode(void);
> >> > +
> >> > extern void __init init_IRQ(void);
> >> >
> >> > #endif /* _ASM_RISCV_IRQ_H */
> >> > diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
> >> > index 7207fa08d78f..f2fed78ab659 100644
> >> > --- a/arch/riscv/kernel/irq.c
> >> > +++ b/arch/riscv/kernel/irq.c
> >> > @@ -7,9 +7,22 @@
> >> >
> >> > #include <linux/interrupt.h>
> >> > #include <linux/irqchip.h>
> >> > +#include <linux/irqdomain.h>
> >> > +#include <linux/module.h>
> >> > #include <linux/seq_file.h>
> >> > #include <asm/smp.h>
> >> >
> >> > +static struct fwnode_handle *intc_fwnode;
> >> > +
> >> > +struct fwnode_handle *riscv_intc_fwnode(void)
> >> > +{
> >> > + if (!intc_fwnode)
> >> > + intc_fwnode = irq_domain_alloc_named_fwnode("RISCV-INTC");
> >> > +
> >> > + return intc_fwnode;
> >> > +}
> >> > +EXPORT_SYMBOL_GPL(riscv_intc_fwnode);
> >>
> >> Why is this created outside of the root interrupt controller driver?
> >> Furthermore, why do you need to create a new fwnode the first place?
> >> As far as I can tell, the INTC does have a node, and what you don't
> >> have is the firmware linkage between PMU (an others) and the INTC.
> >
> > Fair enough, I will update this patch to not create a synthetic fwnode.
> >
> > The issue is not with INTC driver. We have other drivers and places
> > (such as SBI IPI driver, SBI PMU driver, and KVM RISC-V AIA support)
> > where we don't have a way to locate INTC fwnode.
>
> And that's exactly what I am talking about: The INTC is OK (sort of),
> but the firmware is too crap for words, and isn't even able to expose
> where the various endpoints route their interrupts to.

The firmware always runs at a higher privilege level, hence there
is no DT node to discover its presence. The local interrupts
used by the firmware for IPI, Timer and PMU are defined by the RISC-V
ISA specification.

>
> Yes, this is probably fine today because you can describe the topology
> of RISC-V systems on the surface of a post stamp. Once you get to the
> complexity of a server-grade SoC (or worse, a mobile phone style SoC),
> this *implicit topology* stuff doesn't fly, because there is no
> guarantee
> that all endpoints will always all point to the same controller.

The local interrupts (per-CPU) are always managed by the INTC. The
interrupt controllers to manage device interrupts (such as PLIC) can
vary from platform to platform and have INTC as the parent domain.

We already have high-end interrupt controllers (such as AIA) under
development which are scalable for server-grade SoCs, mobile SoCs
and various other types of SoCs.

We are able to describe the topology of different types of interrupt
controllers (PLIC as well as AIA) in DT.

The only issue is for drivers which do not have a dedicated DT node
(such as SBI IPI, SBI Timer, SBI PMU, or KVM RISC-V), but the
upside is that the local interrupt numbers used by these drivers are
clearly defined by the RISC-V ISA specification:

Here are the local interrupts defined by the RISC-V ISA spec:
IRQ13 => PMU overflow interrupt (used by SBI PMU driver)
IRQ12 => S-mode guest external interrupt (to be used by KVM RISC-V)
IRQ11 => M-mode external interrupt (used by firmware)
IRQ9 => S-mode external interrupt (used by PLIC driver)
IRQ7 => M-mode timer interrupt
IRQ5 => S-mode timer interrupt (used by SBI Timer driver)
IRQ3 => M-mode software interrupt (used by firmware)
IRQ1 => S-mode software interrupt (used by SBI IPI driver)

>
> >> what you should have instead is something like:
> >>
> >> static struct fwnode_handle *(*__get_root_intc_node)(void);
> >> struct fwnode_handle *riscv_get_root_intc_hwnode(void)
> >> {
> >> if (__get_root_intc_node)
> >> return __get_root_intc_node();
> >>
> >> return NULL;
> >> }
> >>
> >> and the corresponding registration interface.
> >
> > Thanks, I will follow this suggestion. This is a much better approach
> > and it will avoid touching existing drivers.
> >
> >>
> >> But either way, something breaks: the INTC has one node per CPU, and
> >> expect one irqdomain per CPU. Having a single fwnode completely breaks
> >> the INTC driver (and probably the irqdomain list, as we don't check
> >> for
> >> duplicate entries).
> >>
> >> > diff --git a/drivers/irqchip/irq-riscv-intc.c
> >> > b/drivers/irqchip/irq-riscv-intc.c
> >> > index b65bd8878d4f..26ed62c11768 100644
> >> > --- a/drivers/irqchip/irq-riscv-intc.c
> >> > +++ b/drivers/irqchip/irq-riscv-intc.c
> >> > @@ -112,8 +112,16 @@ static int __init riscv_intc_init(struct
> >> > device_node *node,
> >> > if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
> >> > return 0;
> >> >
> >> > - intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
> >> > - &riscv_intc_domain_ops, NULL);
> >> > + /*
> >> > + * Create INTC domain using a synthetic fwnode which will allow
> >> > + * drivers (such as RISC-V SBI IPI driver, RISC-V timer driver,
> >> > + * RISC-V PMU driver, etc) not having dedicated DT/ACPI node to
> >> > + * directly create interrupt mapping for standard local interrupt
> >> > + * numbers defined by the RISC-V privileged specification.
> >> > + */
> >> > + intc_domain = irq_domain_create_linear(riscv_intc_fwnode(),
> >> > + BITS_PER_LONG,
> >> > + &riscv_intc_domain_ops, NULL);
> >>
> >> This is what I'm talking about. It is simply broken. So either you
> >> don't
> >> need a per-CPU node (and the DT was bad the first place), or you
> >> absolutely need
> >> one (and the whole 'well-known/default domain' doesn't work at all).
> >>
> >> Either way, this patch is plain wrong.
> >
> > Okay, I will update this patch with the new approach which you
> > suggested.
>
> But how do you plan to work around the fact that everything is currently
> build around having a node (and an irqdomain) per CPU? The PLIC, for
> example,
> clearly has one parent per CPU, not one global parent.
>
> I'm sure there was a good reason for this, and I suspect merging the
> domains
> will simply end up breaking things.

We can have multiple PLIC instances in a platform and each PLIC
instance targets a subset of CPUs. Further, each PLIC instance has
multiple PLIC contexts for associated CPUs.

The per-CPU INTC DT nodes and the "interrupts-extended" DT
property of PLIC DT node helps us describe the association
between various PLIC contexts and CPUs.

Here's an example PLIC DT node:

plic: interrupt-controller@c000000 {
#address-cells = <0>;
#interrupt-cells = <1>;
compatible = "sifive,fu540-c000-plic", "sifive,plic-1.0.0";
interrupt-controller;
interrupts-extended = <&cpu0_intc 11>,
<&cpu1_intc 11>, <&cpu1_intc 9>,
<&cpu2_intc 11>, <&cpu2_intc 9>,
<&cpu3_intc 11>, <&cpu3_intc 9>,
<&cpu4_intc 11>, <&cpu4_intc 9>;
reg = <0xc000000 0x4000000>;
riscv,ndev = <10>;
};

In the above example, the PLIC has 9 contexts and the context
to CPU connections are as follows:
PLIC context0 => CPU0 M-mode external interrupt
PLIC context1 => CPU1 M-mode external interrupt
PLIC context2 => CPU1 S-mode external interrupt
PLIC context3 => CPU2 M-mode external interrupt
PLIC context4 => CPU2 S-mode external interrupt
....

This is just one example; we can describe any kind of
PLIC context to CPU connection using the "interrupts-extended"
DT property.

The same level of flexibility is provided by AIA interrupt
controllers which are under development.

Regards,
Anup

>
> M.
> --
> Jazz is not dead. It just smells funny...

2022-02-20 11:51:31

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] irqchip/riscv-intc: Create domain using named fwnode

On 2022-02-19 03:38, Anup Patel wrote:
> On Thu, Feb 17, 2022 at 8:42 PM Marc Zyngier <[email protected]> wrote:
>>
>> On 2022-01-28 05:25, Anup Patel wrote:
>> > We should create INTC domain using a synthetic fwnode which will allow
>> > drivers (such as RISC-V SBI IPI driver, RISC-V timer driver, RISC-V
>> > PMU driver, etc) not having dedicated DT/ACPI node to directly create
>> > interrupt mapping for standard local interrupt numbers defined by the
>> > RISC-V privileged specification.
>> >
>> > Signed-off-by: Anup Patel <[email protected]>
>> > ---
>> > arch/riscv/include/asm/irq.h | 2 ++
>> > arch/riscv/kernel/irq.c | 13 +++++++++++++
>> > drivers/clocksource/timer-clint.c | 13 +++++++------
>> > drivers/clocksource/timer-riscv.c | 11 ++---------
>> > drivers/irqchip/irq-riscv-intc.c | 12 ++++++++++--
>> > drivers/irqchip/irq-sifive-plic.c | 19 +++++++++++--------
>> > 6 files changed, 45 insertions(+), 25 deletions(-)
>> >
>> > diff --git a/arch/riscv/include/asm/irq.h
>> > b/arch/riscv/include/asm/irq.h
>> > index e4c435509983..f85ebaf07505 100644
>> > --- a/arch/riscv/include/asm/irq.h
>> > +++ b/arch/riscv/include/asm/irq.h
>> > @@ -12,6 +12,8 @@
>> >
>> > #include <asm-generic/irq.h>
>> >
>> > +extern struct fwnode_handle *riscv_intc_fwnode(void);
>> > +
>> > extern void __init init_IRQ(void);
>> >
>> > #endif /* _ASM_RISCV_IRQ_H */
>> > diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
>> > index 7207fa08d78f..f2fed78ab659 100644
>> > --- a/arch/riscv/kernel/irq.c
>> > +++ b/arch/riscv/kernel/irq.c
>> > @@ -7,9 +7,22 @@
>> >
>> > #include <linux/interrupt.h>
>> > #include <linux/irqchip.h>
>> > +#include <linux/irqdomain.h>
>> > +#include <linux/module.h>
>> > #include <linux/seq_file.h>
>> > #include <asm/smp.h>
>> >
>> > +static struct fwnode_handle *intc_fwnode;
>> > +
>> > +struct fwnode_handle *riscv_intc_fwnode(void)
>> > +{
>> > + if (!intc_fwnode)
>> > + intc_fwnode = irq_domain_alloc_named_fwnode("RISCV-INTC");
>> > +
>> > + return intc_fwnode;
>> > +}
>> > +EXPORT_SYMBOL_GPL(riscv_intc_fwnode);
>>
>> Why is this created outside of the root interrupt controller driver?
>> Furthermore, why do you need to create a new fwnode the first place?
>> As far as I can tell, the INTC does have a node, and what you don't
>> have is the firmware linkage between PMU (an others) and the INTC.
>
> Fair enough, I will update this patch to not create a synthetic fwnode.
>
> The issue is not with INTC driver. We have other drivers and places
> (such as SBI IPI driver, SBI PMU driver, and KVM RISC-V AIA support)
> where we don't have a way to locate INTC fwnode.

And that's exactly what I am talking about: The INTC is OK (sort of),
but the firmware is too crap for words, and isn't even able to expose
where the various endpoints route their interrupts to.

Yes, this is probably fine today because you can describe the topology
of RISC-V systems on the surface of a post stamp. Once you get to the
complexity of a server-grade SoC (or worse, a mobile phone style SoC),
this *implicit topology* stuff doesn't fly, because there is no
guarantee that all endpoints will always point to the same controller.

>> what you should have instead is something like:
>>
>> static struct fwnode_handle *(*__get_root_intc_node)(void);
>> struct fwnode_handle *riscv_get_root_intc_hwnode(void)
>> {
>> if (__get_root_intc_node)
>> return __get_root_intc_node();
>>
>> return NULL;
>> }
>>
>> and the corresponding registration interface.
>
> Thanks, I will follow this suggestion. This is a much better approach
> and it will avoid touching existing drivers.
>
>>
>> But either way, something breaks: the INTC has one node per CPU, and
>> expect one irqdomain per CPU. Having a single fwnode completely breaks
>> the INTC driver (and probably the irqdomain list, as we don't check
>> for
>> duplicate entries).
>>
>> > diff --git a/drivers/irqchip/irq-riscv-intc.c
>> > b/drivers/irqchip/irq-riscv-intc.c
>> > index b65bd8878d4f..26ed62c11768 100644
>> > --- a/drivers/irqchip/irq-riscv-intc.c
>> > +++ b/drivers/irqchip/irq-riscv-intc.c
>> > @@ -112,8 +112,16 @@ static int __init riscv_intc_init(struct
>> > device_node *node,
>> > if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
>> > return 0;
>> >
>> > - intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
>> > - &riscv_intc_domain_ops, NULL);
>> > + /*
>> > + * Create INTC domain using a synthetic fwnode which will allow
>> > + * drivers (such as RISC-V SBI IPI driver, RISC-V timer driver,
>> > + * RISC-V PMU driver, etc) not having dedicated DT/ACPI node to
>> > + * directly create interrupt mapping for standard local interrupt
>> > + * numbers defined by the RISC-V privileged specification.
>> > + */
>> > + intc_domain = irq_domain_create_linear(riscv_intc_fwnode(),
>> > + BITS_PER_LONG,
>> > + &riscv_intc_domain_ops, NULL);
>>
>> This is what I'm talking about. It is simply broken. So either you
>> don't
>> need a per-CPU node (and the DT was bad the first place), or you
>> absolutely need
>> one (and the whole 'well-known/default domain' doesn't work at all).
>>
>> Either way, this patch is plain wrong.
>
> Okay, I will update this patch with the new approach which you
> suggested.

But how do you plan to work around the fact that everything is currently
built around having a node (and an irqdomain) per CPU? The PLIC, for
example, clearly has one parent per CPU, not one global parent.

I'm sure there was a good reason for this, and I suspect merging the
domains will simply end up breaking things.

M.
--
Jazz is not dead. It just smells funny...

2022-02-21 08:32:36

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] irqchip/riscv-intc: Create domain using named fwnode

On Thu, Feb 17, 2022 at 8:42 PM Marc Zyngier <[email protected]> wrote:
>
> On 2022-01-28 05:25, Anup Patel wrote:
> > We should create INTC domain using a synthetic fwnode which will allow
> > drivers (such as RISC-V SBI IPI driver, RISC-V timer driver, RISC-V
> > PMU driver, etc) not having dedicated DT/ACPI node to directly create
> > interrupt mapping for standard local interrupt numbers defined by the
> > RISC-V privileged specification.
> >
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > arch/riscv/include/asm/irq.h | 2 ++
> > arch/riscv/kernel/irq.c | 13 +++++++++++++
> > drivers/clocksource/timer-clint.c | 13 +++++++------
> > drivers/clocksource/timer-riscv.c | 11 ++---------
> > drivers/irqchip/irq-riscv-intc.c | 12 ++++++++++--
> > drivers/irqchip/irq-sifive-plic.c | 19 +++++++++++--------
> > 6 files changed, 45 insertions(+), 25 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/irq.h
> > b/arch/riscv/include/asm/irq.h
> > index e4c435509983..f85ebaf07505 100644
> > --- a/arch/riscv/include/asm/irq.h
> > +++ b/arch/riscv/include/asm/irq.h
> > @@ -12,6 +12,8 @@
> >
> > #include <asm-generic/irq.h>
> >
> > +extern struct fwnode_handle *riscv_intc_fwnode(void);
> > +
> > extern void __init init_IRQ(void);
> >
> > #endif /* _ASM_RISCV_IRQ_H */
> > diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
> > index 7207fa08d78f..f2fed78ab659 100644
> > --- a/arch/riscv/kernel/irq.c
> > +++ b/arch/riscv/kernel/irq.c
> > @@ -7,9 +7,22 @@
> >
> > #include <linux/interrupt.h>
> > #include <linux/irqchip.h>
> > +#include <linux/irqdomain.h>
> > +#include <linux/module.h>
> > #include <linux/seq_file.h>
> > #include <asm/smp.h>
> >
> > +static struct fwnode_handle *intc_fwnode;
> > +
> > +struct fwnode_handle *riscv_intc_fwnode(void)
> > +{
> > + if (!intc_fwnode)
> > + intc_fwnode = irq_domain_alloc_named_fwnode("RISCV-INTC");
> > +
> > + return intc_fwnode;
> > +}
> > +EXPORT_SYMBOL_GPL(riscv_intc_fwnode);
>
> Why is this created outside of the root interrupt controller driver?
> Furthermore, why do you need to create a new fwnode the first place?
> As far as I can tell, the INTC does have a node, and what you don't
> have is the firmware linkage between PMU (an others) and the INTC.

Fair enough, I will update this patch to not create a synthetic fwnode.

The issue is not with the INTC driver. We have other drivers and places
(such as the SBI IPI driver, SBI PMU driver, and KVM RISC-V AIA support)
where we don't have a way to locate the INTC fwnode.

>
> what you should have instead is something like:
>
> static struct fwnode_handle *(*__get_root_intc_node)(void);
> struct fwnode_handle *riscv_get_root_intc_hwnode(void)
> {
> if (__get_root_intc_node)
> return __get_root_intc_node();
>
> return NULL;
> }
>
> and the corresponding registration interface.

Thanks, I will follow this suggestion. This is a much better approach
and it will avoid touching existing drivers.

>
> But either way, something breaks: the INTC has one node per CPU, and
> expect one irqdomain per CPU. Having a single fwnode completely breaks
> the INTC driver (and probably the irqdomain list, as we don't check for
> duplicate entries).
>
> > diff --git a/drivers/irqchip/irq-riscv-intc.c
> > b/drivers/irqchip/irq-riscv-intc.c
> > index b65bd8878d4f..26ed62c11768 100644
> > --- a/drivers/irqchip/irq-riscv-intc.c
> > +++ b/drivers/irqchip/irq-riscv-intc.c
> > @@ -112,8 +112,16 @@ static int __init riscv_intc_init(struct
> > device_node *node,
> > if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
> > return 0;
> >
> > - intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
> > - &riscv_intc_domain_ops, NULL);
> > + /*
> > + * Create INTC domain using a synthetic fwnode which will allow
> > + * drivers (such as RISC-V SBI IPI driver, RISC-V timer driver,
> > + * RISC-V PMU driver, etc) not having dedicated DT/ACPI node to
> > + * directly create interrupt mapping for standard local interrupt
> > + * numbers defined by the RISC-V privileged specification.
> > + */
> > + intc_domain = irq_domain_create_linear(riscv_intc_fwnode(),
> > + BITS_PER_LONG,
> > + &riscv_intc_domain_ops, NULL);
>
> This is what I'm talking about. It is simply broken. So either you don't
> need a per-CPU node (and the DT was bad the first place), or you
> absolutely need
> one (and the whole 'well-known/default domain' doesn't work at all).
>
> Either way, this patch is plain wrong.

Okay, I will update this patch with the new approach which you suggested.

Regards,
Anup

>
>
> M.
> --
> Jazz is not dead. It just smells funny...

2022-02-21 16:24:01

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] irqchip/riscv-intc: Create domain using named fwnode

On Sat, 19 Feb 2022 14:51:22 +0000,
Jessica Clarke <[email protected]> wrote:
>
> On 19 Feb 2022, at 09:32, Marc Zyngier <[email protected]> wrote:
> >
> > But how do you plan to work around the fact that everything is currently
> > build around having a node (and an irqdomain) per CPU? The PLIC, for example,
> > clearly has one parent per CPU, not one global parent.
> >
> > I'm sure there was a good reason for this, and I suspect merging the domains
> > will simply end up breaking things.
>
> On the contrary, the drivers rely on the controller being the same
> across all harts, with riscv_intc_init skipping initialisation for all
> but the boot hart’s controller. The bindings are a complete pain to
> deal with as a result, what you *want* is like you have in the Arm
> world where there is just one interrupt controller in the device tree
> with some of the interrupts per-processor, but instead we have this
> overengineered nuisance. The only reason there are per-hart interrupt
> controllers is because that’s how the contexts for the CLINT/PLIC are
> specified, but that really should have been done another way rather
> than abusing the interrupts-extended property for that. In the FreeBSD
> world we’ve been totally ignoring the device tree nodes for the local
> interrupt controllers but for my AIA and ACLINT branch I started a few
> months ago (though ACLINT's now been completely screwed up by RVI
> politics, things have been renamed and split up differently in the past
> few days and software interrupts de-prioritised with no current path to
> ratification, so that was a waste of my time) I just hang the driver
> off the boot hart’s node and leave all the others as totally ignored
> and a waste of space other than to figure out the contexts for the PLIC
> etc.
>
> TL;DR yes the bindings are awful, no there’s no issue with merging the
> domains.

I don't know how that flies with something like[1], where CPU0 only
gets interrupts in M-Mode and not S-Mode. Maybe it doesn't really
matter, but this sort of asymmetric routing is totally backward.

It sometimes feels like the RV folks are actively trying to make this
architecture a mess... :-/

M.

[1] CAAhSdy0jTTDzoc+3T_8uLiWfBN3AFCWj99Ayc-Yh8FBfzUY2sQ@mail.gmail.com

--
Without deviation from the norm, progress is not possible.

2022-02-21 22:33:32

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] irqchip/riscv-intc: Create domain using named fwnode

On Mon, Feb 21, 2022 at 2:55 PM Marc Zyngier <[email protected]> wrote:
>
> On Sat, 19 Feb 2022 14:51:22 +0000,
> Jessica Clarke <[email protected]> wrote:
> >
> > On 19 Feb 2022, at 09:32, Marc Zyngier <[email protected]> wrote:
> > >
> > > But how do you plan to work around the fact that everything is currently
> > > build around having a node (and an irqdomain) per CPU? The PLIC, for example,
> > > clearly has one parent per CPU, not one global parent.
> > >
> > > I'm sure there was a good reason for this, and I suspect merging the domains
> > > will simply end up breaking things.
> >
> > On the contrary, the drivers rely on the controller being the same
> > across all harts, with riscv_intc_init skipping initialisation for all
> > but the boot hart’s controller. The bindings are a complete pain to
> > deal with as a result, what you *want* is like you have in the Arm
> > world where there is just one interrupt controller in the device tree
> > with some of the interrupts per-processor, but instead we have this
> > overengineered nuisance. The only reason there are per-hart interrupt
> > controllers is because that’s how the contexts for the CLINT/PLIC are
> > specified, but that really should have been done another way rather
> > than abusing the interrupts-extended property for that. In the FreeBSD
> > world we’ve been totally ignoring the device tree nodes for the local
> > interrupt controllers but for my AIA and ACLINT branch I started a few
> > months ago (though ACLINT's now been completely screwed up by RVI
> > politics, things have been renamed and split up differently in the past
> > few days and software interrupts de-prioritised with no current path to
> > ratification, so that was a waste of my time) I just hang the driver
> > off the boot hart’s node and leave all the others as totally ignored
> > and a waste of space other than to figure out the contexts for the PLIC
> > etc.
> >
> > TL;DR yes the bindings are awful, no there’s no issue with merging the
> > domains.
>
> I don't know how that flies with something like[1], where CPU0 only
> gets interrupts in M-Mode and not S-Mode. Maybe it doesn't really
> matter, but this sort of asymmetric routing is totally backward.

The example PLIC DT node which I provided is from the SiFive FU540, where
we have 5 CPUs. CPU0 in the FU540 is a cache-coherent microcontroller
with only M-mode (i.e. no MMU, hence not Linux capable).

>
> It sometimes feels like the RV folks are actively trying to make this
> architecture a mess... :-/

Well, I still fail to understand what is messy here.

Regards,
Anup

>
> M.
>
> [1] CAAhSdy0jTTDzoc+3T_8uLiWfBN3AFCWj99Ayc-Yh8FBfzUY2sQ@mail.gmail.com
>
> --
> Without deviation from the norm, progress is not possible.

2022-02-22 04:42:47

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] irqchip/riscv-intc: Create domain using named fwnode

On Mon, Feb 21, 2022 at 2:37 PM Marc Zyngier <[email protected]> wrote:
>
> On 2022-02-19 13:03, Anup Patel wrote:
> > On Sat, Feb 19, 2022 at 3:02 PM Marc Zyngier <[email protected]> wrote:
> >>
> >> On 2022-02-19 03:38, Anup Patel wrote:
> >> > On Thu, Feb 17, 2022 at 8:42 PM Marc Zyngier <[email protected]> wrote:
> >> >>
> >> >> On 2022-01-28 05:25, Anup Patel wrote:
> >> >> > We should create INTC domain using a synthetic fwnode which will allow
> >> >> > drivers (such as RISC-V SBI IPI driver, RISC-V timer driver, RISC-V
> >> >> > PMU driver, etc) not having dedicated DT/ACPI node to directly create
> >> >> > interrupt mapping for standard local interrupt numbers defined by the
> >> >> > RISC-V privileged specification.
> >> >> >
> >> >> > Signed-off-by: Anup Patel <[email protected]>
> >> >> > ---
> >> >> > arch/riscv/include/asm/irq.h | 2 ++
> >> >> > arch/riscv/kernel/irq.c | 13 +++++++++++++
> >> >> > drivers/clocksource/timer-clint.c | 13 +++++++------
> >> >> > drivers/clocksource/timer-riscv.c | 11 ++---------
> >> >> > drivers/irqchip/irq-riscv-intc.c | 12 ++++++++++--
> >> >> > drivers/irqchip/irq-sifive-plic.c | 19 +++++++++++--------
> >> >> > 6 files changed, 45 insertions(+), 25 deletions(-)
> >> >> >
> >> >> > diff --git a/arch/riscv/include/asm/irq.h
> >> >> > b/arch/riscv/include/asm/irq.h
> >> >> > index e4c435509983..f85ebaf07505 100644
> >> >> > --- a/arch/riscv/include/asm/irq.h
> >> >> > +++ b/arch/riscv/include/asm/irq.h
> >> >> > @@ -12,6 +12,8 @@
> >> >> >
> >> >> > #include <asm-generic/irq.h>
> >> >> >
> >> >> > +extern struct fwnode_handle *riscv_intc_fwnode(void);
> >> >> > +
> >> >> > extern void __init init_IRQ(void);
> >> >> >
> >> >> > #endif /* _ASM_RISCV_IRQ_H */
> >> >> > diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
> >> >> > index 7207fa08d78f..f2fed78ab659 100644
> >> >> > --- a/arch/riscv/kernel/irq.c
> >> >> > +++ b/arch/riscv/kernel/irq.c
> >> >> > @@ -7,9 +7,22 @@
> >> >> >
> >> >> > #include <linux/interrupt.h>
> >> >> > #include <linux/irqchip.h>
> >> >> > +#include <linux/irqdomain.h>
> >> >> > +#include <linux/module.h>
> >> >> > #include <linux/seq_file.h>
> >> >> > #include <asm/smp.h>
> >> >> >
> >> >> > +static struct fwnode_handle *intc_fwnode;
> >> >> > +
> >> >> > +struct fwnode_handle *riscv_intc_fwnode(void)
> >> >> > +{
> >> >> > + if (!intc_fwnode)
> >> >> > + intc_fwnode = irq_domain_alloc_named_fwnode("RISCV-INTC");
> >> >> > +
> >> >> > + return intc_fwnode;
> >> >> > +}
> >> >> > +EXPORT_SYMBOL_GPL(riscv_intc_fwnode);
> >> >>
> >> >> Why is this created outside of the root interrupt controller driver?
> >> >> Furthermore, why do you need to create a new fwnode the first place?
> >> >> As far as I can tell, the INTC does have a node, and what you don't
> >> >> have is the firmware linkage between PMU (an others) and the INTC.
> >> >
> >> > Fair enough, I will update this patch to not create a synthetic fwnode.
> >> >
> >> > The issue is not with INTC driver. We have other drivers and places
> >> > (such as SBI IPI driver, SBI PMU driver, and KVM RISC-V AIA support)
> >> > where we don't have a way to locate INTC fwnode.
> >>
> >> And that's exactly what I am talking about: The INTC is OK (sort of),
> >> but the firmware is too crap for words, and isn't even able to expose
> >> where the various endpoints route their interrupts to.
> >
> > The firmware is always present at a higher privilege-level hence there
> > is no DT node to discover the presence of firmware. The local
> > interrupts
> > used by the firmware for IPI, Timer and PMU are defined by the RISC-V
> > ISA specification.
> >
> >>
> >> Yes, this is probably fine today because you can describe the topology
> >> of RISC-V systems on the surface of a post stamp. Once you get to the
> >> complexity of a server-grade SoC (or worse, a mobile phone style SoC),
> >> this *implicit topology* stuff doesn't fly, because there is no
> >> guarantee
> >> that all endpoints will always all point to the same controller.
> >
> > The local interrupts (per-CPU) are always managed by the INTC. The
> > interrupt controllers to manage device interrupts (such as PLIC) can
> > vary from platform to platform and have INTC as the parent domain.
>
> I don't know how to make it clearer: this isn't about the situation
> *today*. It is about what you will have two or five years from now.
>
> Relying on a default topology is stupidly bad, and you will end-up
> in a terrible corner eventually, because you can't have *two*
> defaults.
>
> >
> > We already have high-end interrupt controllers (such as AIA) under
> > development which are scalable for server-grade SoC, mobile SoC
> > and various other types of SoCs.
> >
> > We are able to describe the topology of different types of interrupt
> > controllers (PLIC as well as AIA) in DT.
> >
> > The only issue is for drivers which do not have dedicated DT node
> > (such as SBI IPI, SBI Timer, SBI PMU, or KVM RISC-V) but the
> > upside is that local interrupt numbers used by these drivers is
> > clearly defined by the RISC-V ISA specification:
> >
> > Here are the local interrupts defined by the RISC-V ISA spec:
> > IRQ13 => PMU overflow interrupt (used by SBI PMU driver)
> > IRQ12 => S-mode guest external interrupt (to be used by KVM RISC-V)
> > IRQ11 => M-mode external interrupt (used by firmware)
> > IRQ9 => S-mode external interrupt (used by PLIC driver)
> > IRQ7 => M-mode timer interrupt
> > IRQ5 => S-mode timer interrupt (used by SBI Timer driver)
> > IRQ3 => M-mode software interrupt (used by firmware)
> > IRQ1 => S-mode software interrupt (used by SBI IPI driver)
>
> Again, you are missing the point. It isn't about the interrupt
> number (nobody gives a crap about them). It is about the entity
> the device is connected to. No amount of copy pasting of the
> spec changes that.

The INTC CSRs are part of the RISC-V privileged specification,
and maintaining backward compatibility is of utmost importance in the
RISC-V ISA, so I don't see why the per-CPU INTC would be replaced
by something totally incompatible in the future.

>
> >>
> >> >> what you should have instead is something like:
> >> >>
> >> >> static struct fwnode_handle *(*__get_root_intc_node)(void);
> >> >> struct fwnode_handle *riscv_get_root_intc_hwnode(void)
> >> >> {
> >> >> if (__get_root_intc_node)
> >> >> return __get_root_intc_node();
> >> >>
> >> >> return NULL;
> >> >> }
> >> >>
> >> >> and the corresponding registration interface.
> >> >
> >> > Thanks, I will follow this suggestion. This is a much better approach
> >> > and it will avoid touching existing drivers.
> >> >
> >> >>
> >> >> But either way, something breaks: the INTC has one node per CPU, and
> >> >> expect one irqdomain per CPU. Having a single fwnode completely breaks
> >> >> the INTC driver (and probably the irqdomain list, as we don't check
> >> >> for
> >> >> duplicate entries).
> >> >>
> >> >> > diff --git a/drivers/irqchip/irq-riscv-intc.c
> >> >> > b/drivers/irqchip/irq-riscv-intc.c
> >> >> > index b65bd8878d4f..26ed62c11768 100644
> >> >> > --- a/drivers/irqchip/irq-riscv-intc.c
> >> >> > +++ b/drivers/irqchip/irq-riscv-intc.c
> >> >> > @@ -112,8 +112,16 @@ static int __init riscv_intc_init(struct
> >> >> > device_node *node,
> >> >> > if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
> >> >> > return 0;
> >> >> >
> >> >> > - intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
> >> >> > - &riscv_intc_domain_ops, NULL);
> >> >> > + /*
> >> >> > + * Create INTC domain using a synthetic fwnode which will allow
> >> >> > + * drivers (such as RISC-V SBI IPI driver, RISC-V timer driver,
> >> >> > + * RISC-V PMU driver, etc) not having dedicated DT/ACPI node to
> >> >> > + * directly create interrupt mapping for standard local interrupt
> >> >> > + * numbers defined by the RISC-V privileged specification.
> >> >> > + */
> >> >> > + intc_domain = irq_domain_create_linear(riscv_intc_fwnode(),
> >> >> > + BITS_PER_LONG,
> >> >> > + &riscv_intc_domain_ops, NULL);
> >> >>
> >> >> This is what I'm talking about. It is simply broken. So either you
> >> >> don't
> >> >> need a per-CPU node (and the DT was bad the first place), or you
> >> >> absolutely need
> >> >> one (and the whole 'well-known/default domain' doesn't work at all).
> >> >>
> >> >> Either way, this patch is plain wrong.
> >> >
> >> > Okay, I will update this patch with the new approach which you
> >> > suggested.
> >>
> >> But how do you plan to work around the fact that everything is
> >> currently
> >> build around having a node (and an irqdomain) per CPU? The PLIC, for
> >> example,
> >> clearly has one parent per CPU, not one global parent.
> >>
> >> I'm sure there was a good reason for this, and I suspect merging the
> >> domains
> >> will simply end up breaking things.
> >
> > We can have multiple PLIC instances in a platform and each PLIC
> > instance targets a subset of CPUs. Further, each PLIC instance has
> > multiple PLIC contexts for associated CPUs.
> >
> > The per-CPU INTC DT nodes and the "interrupts-extended" DT
> > property of PLIC DT node helps us describe the association
> > between various PLIC contexts and CPUs.
> >
> > Here's an example PLIC DT node:
> >
> > plic: interrupt-controller@c000000 {
> > #address-cells = <0>;
> > #interrupt-cells = <1>;
> > compatible = "sifive,fu540-c000-plic", "sifive,plic-1.0.0";
> > interrupt-controller;
> > interrupts-extended = <&cpu0_intc 11>,
> > <&cpu1_intc 11>, <&cpu1_intc 9>,
> > <&cpu2_intc 11>, <&cpu2_intc 9>,
> > <&cpu3_intc 11>, <&cpu3_intc 9>,
> > <&cpu4_intc 11>, <&cpu4_intc 9>;
> > reg = <0xc000000 0x4000000>;
> > riscv,ndev = <10>;
> > };
> >
> > In the above example, the PLIC has 9 contexts and the context
> > to CPU connections are as follows:
> > PLIC context0 => CPU0 M-mode external interrupt
> > PLIC context1 => CPU1 M-mode external interrupt
> > PLIC context2 => CPU1 S-mode external interrupt
> > PLIC context3 => CPU2 M-mode external interrupt
> > PLIC context4 => CPU2 S-mode external interrupt
> > ....
>
> Asymmetric interrupt routing. How lovely. How broken.

This is a known issue with the PLIC, where contexts from the same
PLIC can target both M-mode and S-mode. This asymmetric
interrupt routing is not present in the under-development AIA.

>
> >
> > This is just one example and we can describe any kind of
> > PLIC context to CPU connections using "interrupts-extended"
> > DT property.
> >
> > The same level of flexibility is provided by AIA interrupt
> > controllers which are under development.
>
> How promising. I really hope someone will eventually barge in
> and clean this mess.

The INTC DT node being per-CPU is the only strange thing
in RISC-V, but we can't change the INTC DT bindings at this stage.

Apart from this, I still don't see anything messy here.

Regards,
Anup

>
> M.
> --
> Jazz is not dead. It just smells funny...

2022-02-22 05:12:46

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] irqchip/riscv-intc: Create domain using named fwnode

On 2022-02-19 13:03, Anup Patel wrote:
> On Sat, Feb 19, 2022 at 3:02 PM Marc Zyngier <[email protected]> wrote:
>>
>> On 2022-02-19 03:38, Anup Patel wrote:
>> > On Thu, Feb 17, 2022 at 8:42 PM Marc Zyngier <[email protected]> wrote:
>> >>
>> >> On 2022-01-28 05:25, Anup Patel wrote:
>> >> > We should create INTC domain using a synthetic fwnode which will allow
>> >> > drivers (such as RISC-V SBI IPI driver, RISC-V timer driver, RISC-V
>> >> > PMU driver, etc) not having dedicated DT/ACPI node to directly create
>> >> > interrupt mapping for standard local interrupt numbers defined by the
>> >> > RISC-V privileged specification.
>> >> >
>> >> > Signed-off-by: Anup Patel <[email protected]>
>> >> > ---
>> >> > arch/riscv/include/asm/irq.h | 2 ++
>> >> > arch/riscv/kernel/irq.c | 13 +++++++++++++
>> >> > drivers/clocksource/timer-clint.c | 13 +++++++------
>> >> > drivers/clocksource/timer-riscv.c | 11 ++---------
>> >> > drivers/irqchip/irq-riscv-intc.c | 12 ++++++++++--
>> >> > drivers/irqchip/irq-sifive-plic.c | 19 +++++++++++--------
>> >> > 6 files changed, 45 insertions(+), 25 deletions(-)
>> >> >
>> >> > diff --git a/arch/riscv/include/asm/irq.h
>> >> > b/arch/riscv/include/asm/irq.h
>> >> > index e4c435509983..f85ebaf07505 100644
>> >> > --- a/arch/riscv/include/asm/irq.h
>> >> > +++ b/arch/riscv/include/asm/irq.h
>> >> > @@ -12,6 +12,8 @@
>> >> >
>> >> > #include <asm-generic/irq.h>
>> >> >
>> >> > +extern struct fwnode_handle *riscv_intc_fwnode(void);
>> >> > +
>> >> > extern void __init init_IRQ(void);
>> >> >
>> >> > #endif /* _ASM_RISCV_IRQ_H */
>> >> > diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
>> >> > index 7207fa08d78f..f2fed78ab659 100644
>> >> > --- a/arch/riscv/kernel/irq.c
>> >> > +++ b/arch/riscv/kernel/irq.c
>> >> > @@ -7,9 +7,22 @@
>> >> >
>> >> > #include <linux/interrupt.h>
>> >> > #include <linux/irqchip.h>
>> >> > +#include <linux/irqdomain.h>
>> >> > +#include <linux/module.h>
>> >> > #include <linux/seq_file.h>
>> >> > #include <asm/smp.h>
>> >> >
>> >> > +static struct fwnode_handle *intc_fwnode;
>> >> > +
>> >> > +struct fwnode_handle *riscv_intc_fwnode(void)
>> >> > +{
>> >> > + if (!intc_fwnode)
>> >> > + intc_fwnode = irq_domain_alloc_named_fwnode("RISCV-INTC");
>> >> > +
>> >> > + return intc_fwnode;
>> >> > +}
>> >> > +EXPORT_SYMBOL_GPL(riscv_intc_fwnode);
>> >>
>> >> Why is this created outside of the root interrupt controller driver?
>> >> Furthermore, why do you need to create a new fwnode the first place?
>> >> As far as I can tell, the INTC does have a node, and what you don't
>> >> have is the firmware linkage between PMU (an others) and the INTC.
>> >
>> > Fair enough, I will update this patch to not create a synthetic fwnode.
>> >
>> > The issue is not with INTC driver. We have other drivers and places
>> > (such as SBI IPI driver, SBI PMU driver, and KVM RISC-V AIA support)
>> > where we don't have a way to locate INTC fwnode.
>>
>> And that's exactly what I am talking about: The INTC is OK (sort of),
>> but the firmware is too crap for words, and isn't even able to expose
>> where the various endpoints route their interrupts to.
>
> The firmware is always present at a higher privilege-level hence there
> is no DT node to discover the presence of firmware. The local
> interrupts
> used by the firmware for IPI, Timer and PMU are defined by the RISC-V
> ISA specification.
>
>>
>> Yes, this is probably fine today because you can describe the topology
>> of RISC-V systems on the surface of a post stamp. Once you get to the
>> complexity of a server-grade SoC (or worse, a mobile phone style SoC),
>> this *implicit topology* stuff doesn't fly, because there is no
>> guarantee
>> that all endpoints will always all point to the same controller.
>
> The local interrupts (per-CPU) are always managed by the INTC. The
> interrupt controllers to manage device interrupts (such as PLIC) can
> vary from platform to platform and have INTC as the parent domain.

I don't know how to make it clearer: this isn't about the situation
*today*. It is about what you will have two or five years from now.

Relying on a default topology is stupidly bad, and you will end up
in a terrible corner eventually, because you can't have *two*
defaults.

>
> We already have high-end interrupt controllers (such as AIA) under
> development which are scalable for server-grade SoC, mobile SoC
> and various other types of SoCs.
>
> We are able to describe the topology of different types of interrupt
> controllers (PLIC as well as AIA) in DT.
>
> The only issue is for drivers which do not have a dedicated DT node
> (such as SBI IPI, SBI Timer, SBI PMU, or KVM RISC-V), but the
> upside is that the local interrupt numbers used by these drivers are
> clearly defined by the RISC-V ISA specification:
>
> Here are the local interrupts defined by the RISC-V ISA spec:
> IRQ13 => PMU overflow interrupt (used by SBI PMU driver)
> IRQ12 => S-mode guest external interrupt (to be used by KVM RISC-V)
> IRQ11 => M-mode external interrupt (used by firmware)
> IRQ9 => S-mode external interrupt (used by PLIC driver)
> IRQ7 => M-mode timer interrupt
> IRQ5 => S-mode timer interrupt (used by SBI Timer driver)
> IRQ3 => M-mode software interrupt (used by firmware)
> IRQ1 => S-mode software interrupt (used by SBI IPI driver)

Again, you are missing the point. It isn't about the interrupt
number (nobody gives a crap about them). It is about the entity
the device is connected to. No amount of copy pasting of the
spec changes that.

>>
>> >> what you should have instead is something like:
>> >>
>> >> static struct fwnode_handle *(*__get_root_intc_node)(void);
>> >> struct fwnode_handle *riscv_get_root_intc_hwnode(void)
>> >> {
>> >> if (__get_root_intc_node)
>> >> return __get_root_intc_node();
>> >>
>> >> return NULL;
>> >> }
>> >>
>> >> and the corresponding registration interface.
>> >
>> > Thanks, I will follow this suggestion. This is a much better approach
>> > and it will avoid touching existing drivers.
>> >
>> >>
>> >> But either way, something breaks: the INTC has one node per CPU, and
>> >> expects one irqdomain per CPU. Having a single fwnode completely breaks
>> >> the INTC driver (and probably the irqdomain list, as we don't check
>> >> for duplicate entries).
>> >>
>> >> > diff --git a/drivers/irqchip/irq-riscv-intc.c
>> >> > b/drivers/irqchip/irq-riscv-intc.c
>> >> > index b65bd8878d4f..26ed62c11768 100644
>> >> > --- a/drivers/irqchip/irq-riscv-intc.c
>> >> > +++ b/drivers/irqchip/irq-riscv-intc.c
>> >> > @@ -112,8 +112,16 @@ static int __init riscv_intc_init(struct
>> >> > device_node *node,
>> >> > if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
>> >> > return 0;
>> >> >
>> >> > - intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
>> >> > - &riscv_intc_domain_ops, NULL);
>> >> > + /*
>> >> > + * Create INTC domain using a synthetic fwnode which will allow
>> >> > + * drivers (such as RISC-V SBI IPI driver, RISC-V timer driver,
>> >> > + * RISC-V PMU driver, etc) not having dedicated DT/ACPI node to
>> >> > + * directly create interrupt mapping for standard local interrupt
>> >> > + * numbers defined by the RISC-V privileged specification.
>> >> > + */
>> >> > + intc_domain = irq_domain_create_linear(riscv_intc_fwnode(),
>> >> > + BITS_PER_LONG,
>> >> > + &riscv_intc_domain_ops, NULL);
>> >>
>> >> This is what I'm talking about. It is simply broken. So either you
>> >> don't
>> >> need a per-CPU node (and the DT was bad the first place), or you
>> >> absolutely need
>> >> one (and the whole 'well-known/default domain' doesn't work at all).
>> >>
>> >> Either way, this patch is plain wrong.
>> >
>> > Okay, I will update this patch with the new approach which you
>> > suggested.
>>
>> But how do you plan to work around the fact that everything is
>> currently built around having a node (and an irqdomain) per CPU?
>> The PLIC, for example, clearly has one parent per CPU, not one
>> global parent.
>>
>> I'm sure there was a good reason for this, and I suspect merging the
>> domains
>> will simply end up breaking things.
>
> We can have multiple PLIC instances in a platform and each PLIC
> instance targets a subset of CPUs. Further, each PLIC instance has
> multiple PLIC contexts for associated CPUs.
>
> The per-CPU INTC DT nodes and the "interrupts-extended" DT
> property of the PLIC DT node help us describe the association
> between the various PLIC contexts and CPUs.
>
> Here's an example PLIC DT node:
>
> plic: interrupt-controller@c000000 {
> #address-cells = <0>;
> #interrupt-cells = <1>;
> compatible = "sifive,fu540-c000-plic", "sifive,plic-1.0.0";
> interrupt-controller;
> interrupts-extended = <&cpu0_intc 11>,
> <&cpu1_intc 11>, <&cpu1_intc 9>,
> <&cpu2_intc 11>, <&cpu2_intc 9>,
> <&cpu3_intc 11>, <&cpu3_intc 9>,
> <&cpu4_intc 11>, <&cpu4_intc 9>;
> reg = <0xc000000 0x4000000>;
> riscv,ndev = <10>;
> };
>
> In the above example, the PLIC has 9 contexts, and the context
> to CPU connections are as follows:
> PLIC context0 => CPU0 M-mode external interrupt
> PLIC context1 => CPU1 M-mode external interrupt
> PLIC context2 => CPU1 S-mode external interrupt
> PLIC context3 => CPU2 M-mode external interrupt
> PLIC context4 => CPU2 S-mode external interrupt
> ....

Asymmetric interrupt routing. How lovely. How broken.

>
> This is just one example; we can describe any arrangement of
> PLIC context to CPU connections using the "interrupts-extended"
> DT property.
>
> The same level of flexibility is provided by AIA interrupt
> controllers which are under development.

How promising. I really hope someone will eventually barge in
and clean this mess.

M.
--
Jazz is not dead. It just smells funny...