The RISC-V AIA specification is now frozen as per the RISC-V International
process. The latest frozen specification can be found at:
https://github.com/riscv/riscv-aia/releases/download/1.0-RC1/riscv-interrupts-1.0-RC1.pdf
At a high level, the AIA specification adds three things:
1) AIA CSRs
- Improved local interrupt support
2) Incoming Message Signaled Interrupt Controller (IMSIC)
- Per-HART MSI controller
- Support MSI virtualization
- Support IPI along with virtualization
3) Advanced Platform-Level Interrupt Controller (APLIC)
- Wired interrupt controller
- In MSI-mode, converts wired interrupts into MSIs (i.e. MSI generator)
- In Direct-mode, injects external interrupts directly into HARTs
For an overview of the AIA specification, refer to the recent AIA virtualization
talk at KVM Forum 2022:
https://static.sched.com/hosted_files/kvmforum2022/a1/AIA_Virtualization_in_KVM_RISCV_final.pdf
https://www.youtube.com/watch?v=r071dL8Z0yo
This series adds the required Linux irqchip drivers for AIA and depends on
the recent "RISC-V IPI Improvements" series.
(Refer to https://lore.kernel.org/lkml/[email protected]/t/)
To test this series, use QEMU v7.2 (or higher) and OpenSBI v1.2 (or higher).
These patches can also be found in the riscv_aia_v2 branch at:
https://github.com/avpatel/linux.git
Changes since v1:
- Rebased on Linux-6.2-rc2
- Addressed comments on IMSIC DT bindings for PATCH4
- Use raw_spin_lock_irqsave() on ids_lock for PATCH5
- Improved MMIO alignment checks in PATCH5 to allow MMIO regions
with holes.
- Addressed comments on APLIC DT bindings for PATCH6
- Fixed warning splat in aplic_msi_write_msg() caused by
zeroed MSI message in PATCH7
- Dropped the DT property riscv,slow-ipi; a module parameter will be
added instead in the future.
Anup Patel (9):
RISC-V: Add AIA related CSR defines
RISC-V: Detect AIA CSRs from ISA string
irqchip/riscv-intc: Add support for RISC-V AIA
dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
irqchip: Add RISC-V incoming MSI controller driver
dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
irqchip: Add RISC-V advanced PLIC driver
RISC-V: Select APLIC and IMSIC drivers
MAINTAINERS: Add entry for RISC-V AIA drivers
.../interrupt-controller/riscv,aplic.yaml | 159 +++
.../interrupt-controller/riscv,imsics.yaml | 168 +++
MAINTAINERS | 12 +
arch/riscv/Kconfig | 2 +
arch/riscv/include/asm/csr.h | 92 ++
arch/riscv/include/asm/hwcap.h | 8 +
arch/riscv/kernel/cpu.c | 2 +
arch/riscv/kernel/cpufeature.c | 2 +
drivers/irqchip/Kconfig | 20 +-
drivers/irqchip/Makefile | 2 +
drivers/irqchip/irq-riscv-aplic.c | 670 ++++++++++
drivers/irqchip/irq-riscv-imsic.c | 1174 +++++++++++++++++
drivers/irqchip/irq-riscv-intc.c | 37 +-
include/linux/irqchip/riscv-aplic.h | 117 ++
include/linux/irqchip/riscv-imsic.h | 92 ++
15 files changed, 2550 insertions(+), 7 deletions(-)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
create mode 100644 drivers/irqchip/irq-riscv-aplic.c
create mode 100644 drivers/irqchip/irq-riscv-imsic.c
create mode 100644 include/linux/irqchip/riscv-aplic.h
create mode 100644 include/linux/irqchip/riscv-imsic.h
--
2.34.1
The RISC-V advanced interrupt architecture (AIA) extends the per-HART
local interrupts in the following ways:
1. Minimum 64 local interrupts for both RV32 and RV64
2. Ability to process multiple pending local interrupts in the same
interrupt handler
3. Priority configuration for each local interrupt
4. Special CSRs to configure/access the per-HART MSI controller
This patch adds support for RISC-V AIA in the RISC-V intc driver.
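
As a rough illustration of point 1, the user-space sketch below models how
a local interrupt number could be split between the base enable CSR and its
high-half counterpart when XLEN is 32. The names aia_ie_slot and
aia_ie_locate are hypothetical and not part of the driver; xlen stands in
for BITS_PER_LONG.

```c
#include <stdbool.h>

/* Hypothetical helper type: which enable CSR holds a local interrupt. */
struct aia_ie_slot {
	bool high_half;		/* true: the high-half CSR (e.g. sieh) */
	unsigned int bit;	/* bit position within that CSR */
};

/*
 * Model of the mask/unmask decision: interrupts below xlen live in the
 * base enable CSR, the remaining ones in the high-half CSR.
 * xlen plays the role of BITS_PER_LONG (32 or 64).
 */
struct aia_ie_slot aia_ie_locate(unsigned int hwirq, unsigned int xlen)
{
	struct aia_ie_slot slot;

	slot.high_half = hwirq >= xlen;
	slot.bit = slot.high_half ? hwirq - xlen : hwirq;
	return slot;
}
```

With xlen = 32, local interrupt 35 lands in bit 3 of the high-half CSR; with
xlen = 64 the same interrupt stays in bit 35 of the base CSR.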
Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/irq-riscv-intc.c | 37 ++++++++++++++++++++++++++------
1 file changed, 31 insertions(+), 6 deletions(-)
diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index f229e3e66387..880d1639aadc 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -16,6 +16,7 @@
#include <linux/module.h>
#include <linux/of.h>
#include <linux/smp.h>
+#include <asm/hwcap.h>
static struct irq_domain *intc_domain;
@@ -29,6 +30,15 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
generic_handle_domain_irq(intc_domain, cause);
}
+static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)
+{
+ unsigned long topi;
+
+ while ((topi = csr_read(CSR_TOPI)))
+ generic_handle_domain_irq(intc_domain,
+ topi >> TOPI_IID_SHIFT);
+}
+
/*
* On RISC-V systems local interrupts are masked or unmasked by writing
* the SIE (Supervisor Interrupt Enable) CSR. As CSRs can only be written
@@ -38,12 +48,18 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
static void riscv_intc_irq_mask(struct irq_data *d)
{
- csr_clear(CSR_IE, BIT(d->hwirq));
+ if (d->hwirq < BITS_PER_LONG)
+ csr_clear(CSR_IE, BIT(d->hwirq));
+ else
+ csr_clear(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
}
static void riscv_intc_irq_unmask(struct irq_data *d)
{
- csr_set(CSR_IE, BIT(d->hwirq));
+ if (d->hwirq < BITS_PER_LONG)
+ csr_set(CSR_IE, BIT(d->hwirq));
+ else
+ csr_set(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
}
static void riscv_intc_irq_eoi(struct irq_data *d)
@@ -115,7 +131,7 @@ static struct fwnode_handle *riscv_intc_hwnode(void)
static int __init riscv_intc_init(struct device_node *node,
struct device_node *parent)
{
- int rc;
+ int rc, nr_irqs;
unsigned long hartid;
rc = riscv_of_parent_hartid(node, &hartid);
@@ -133,14 +149,21 @@ static int __init riscv_intc_init(struct device_node *node,
if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
return 0;
- intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
+ nr_irqs = BITS_PER_LONG;
+ if (riscv_isa_extension_available(NULL, SxAIA) && BITS_PER_LONG == 32)
+ nr_irqs = nr_irqs * 2;
+
+ intc_domain = irq_domain_add_linear(node, nr_irqs,
&riscv_intc_domain_ops, NULL);
if (!intc_domain) {
pr_err("unable to add IRQ domain\n");
return -ENXIO;
}
- rc = set_handle_irq(&riscv_intc_irq);
+ if (riscv_isa_extension_available(NULL, SxAIA))
+ rc = set_handle_irq(&riscv_intc_aia_irq);
+ else
+ rc = set_handle_irq(&riscv_intc_irq);
if (rc) {
pr_err("failed to set irq handler\n");
return rc;
@@ -148,7 +171,9 @@ static int __init riscv_intc_init(struct device_node *node,
riscv_set_intc_hwnode_fn(riscv_intc_hwnode);
- pr_info("%d local interrupts mapped\n", BITS_PER_LONG);
+ pr_info("%d local interrupts mapped%s\n",
+ nr_irqs, (riscv_isa_extension_available(NULL, SxAIA)) ?
+ " using AIA" : "");
return 0;
}
--
2.34.1
The RISC-V AIA specification improves handling of per-HART local interrupts
in a backward compatible manner. This patch adds defines for the new RISC-V
AIA CSRs.
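
As a minimal sketch of how the TOPI field defines are meant to be used, the
user-space helpers below decode a raw topi value. The shift and mask values
mirror the TOPI_* defines added by this patch; the helper names topi_iid
and topi_iprio are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Values mirror the TOPI_* defines added by this patch. */
#define TOPI_IID_SHIFT		16
#define TOPI_IID_MASK		0xfff
#define TOPI_IPRIO_MASK		0xff

/* Extract the interrupt identity (IID) field from a topi value. */
unsigned int topi_iid(uint64_t topi)
{
	return (unsigned int)((topi >> TOPI_IID_SHIFT) & TOPI_IID_MASK);
}

/* Extract the interrupt priority (IPRIO) field from a topi value. */
unsigned int topi_iprio(uint64_t topi)
{
	return (unsigned int)(topi & TOPI_IPRIO_MASK);
}
```

A topi value of zero means no interrupt is pending, which is why a handler
can loop until the CSR reads back as zero.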
Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/include/asm/csr.h | 92 ++++++++++++++++++++++++++++++++++++
1 file changed, 92 insertions(+)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 0e571f6483d9..4e1356bad7b2 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -73,7 +73,10 @@
#define IRQ_S_EXT 9
#define IRQ_VS_EXT 10
#define IRQ_M_EXT 11
+#define IRQ_S_GEXT 12
#define IRQ_PMU_OVF 13
+#define IRQ_LOCAL_MAX (IRQ_PMU_OVF + 1)
+#define IRQ_LOCAL_MASK ((_AC(1, UL) << IRQ_LOCAL_MAX) - 1)
/* Exception causes */
#define EXC_INST_MISALIGNED 0
@@ -156,6 +159,26 @@
(_AC(1, UL) << IRQ_S_TIMER) | \
(_AC(1, UL) << IRQ_S_EXT))
+/* AIA CSR bits */
+#define TOPI_IID_SHIFT 16
+#define TOPI_IID_MASK 0xfff
+#define TOPI_IPRIO_MASK 0xff
+#define TOPI_IPRIO_BITS 8
+
+#define TOPEI_ID_SHIFT 16
+#define TOPEI_ID_MASK 0x7ff
+#define TOPEI_PRIO_MASK 0x7ff
+
+#define ISELECT_IPRIO0 0x30
+#define ISELECT_IPRIO15 0x3f
+#define ISELECT_MASK 0x1ff
+
+#define HVICTL_VTI 0x40000000
+#define HVICTL_IID 0x0fff0000
+#define HVICTL_IID_SHIFT 16
+#define HVICTL_IPRIOM 0x00000100
+#define HVICTL_IPRIO 0x000000ff
+
/* xENVCFG flags */
#define ENVCFG_STCE (_AC(1, ULL) << 63)
#define ENVCFG_PBMTE (_AC(1, ULL) << 62)
@@ -250,6 +273,18 @@
#define CSR_STIMECMP 0x14D
#define CSR_STIMECMPH 0x15D
+/* Supervisor-Level Window to Indirectly Accessed Registers (AIA) */
+#define CSR_SISELECT 0x150
+#define CSR_SIREG 0x151
+
+/* Supervisor-Level Interrupts (AIA) */
+#define CSR_STOPEI 0x15c
+#define CSR_STOPI 0xdb0
+
+/* Supervisor-Level High-Half CSRs (AIA) */
+#define CSR_SIEH 0x114
+#define CSR_SIPH 0x154
+
#define CSR_VSSTATUS 0x200
#define CSR_VSIE 0x204
#define CSR_VSTVEC 0x205
@@ -279,8 +314,32 @@
#define CSR_HGATP 0x680
#define CSR_HGEIP 0xe12
+/* Virtual Interrupts and Interrupt Priorities (H-extension with AIA) */
+#define CSR_HVIEN 0x608
+#define CSR_HVICTL 0x609
+#define CSR_HVIPRIO1 0x646
+#define CSR_HVIPRIO2 0x647
+
+/* VS-Level Window to Indirectly Accessed Registers (H-extension with AIA) */
+#define CSR_VSISELECT 0x250
+#define CSR_VSIREG 0x251
+
+/* VS-Level Interrupts (H-extension with AIA) */
+#define CSR_VSTOPEI 0x25c
+#define CSR_VSTOPI 0xeb0
+
+/* Hypervisor and VS-Level High-Half CSRs (H-extension with AIA) */
+#define CSR_HIDELEGH 0x613
+#define CSR_HVIENH 0x618
+#define CSR_HVIPH 0x655
+#define CSR_HVIPRIO1H 0x656
+#define CSR_HVIPRIO2H 0x657
+#define CSR_VSIEH 0x214
+#define CSR_VSIPH 0x254
+
#define CSR_MSTATUS 0x300
#define CSR_MISA 0x301
+#define CSR_MIDELEG 0x303
#define CSR_MIE 0x304
#define CSR_MTVEC 0x305
#define CSR_MENVCFG 0x30a
@@ -297,6 +356,25 @@
#define CSR_MIMPID 0xf13
#define CSR_MHARTID 0xf14
+/* Machine-Level Window to Indirectly Accessed Registers (AIA) */
+#define CSR_MISELECT 0x350
+#define CSR_MIREG 0x351
+
+/* Machine-Level Interrupts (AIA) */
+#define CSR_MTOPEI 0x35c
+#define CSR_MTOPI 0xfb0
+
+/* Virtual Interrupts for Supervisor Level (AIA) */
+#define CSR_MVIEN 0x308
+#define CSR_MVIP 0x309
+
+/* Machine-Level High-Half CSRs (AIA) */
+#define CSR_MIDELEGH 0x313
+#define CSR_MIEH 0x314
+#define CSR_MVIENH 0x318
+#define CSR_MVIPH 0x319
+#define CSR_MIPH 0x354
+
#ifdef CONFIG_RISCV_M_MODE
# define CSR_STATUS CSR_MSTATUS
# define CSR_IE CSR_MIE
@@ -307,6 +385,13 @@
# define CSR_TVAL CSR_MTVAL
# define CSR_IP CSR_MIP
+# define CSR_IEH CSR_MIEH
+# define CSR_ISELECT CSR_MISELECT
+# define CSR_IREG CSR_MIREG
+# define CSR_IPH CSR_MIPH
+# define CSR_TOPEI CSR_MTOPEI
+# define CSR_TOPI CSR_MTOPI
+
# define SR_IE SR_MIE
# define SR_PIE SR_MPIE
# define SR_PP SR_MPP
@@ -324,6 +409,13 @@
# define CSR_TVAL CSR_STVAL
# define CSR_IP CSR_SIP
+# define CSR_IEH CSR_SIEH
+# define CSR_ISELECT CSR_SISELECT
+# define CSR_IREG CSR_SIREG
+# define CSR_IPH CSR_SIPH
+# define CSR_TOPEI CSR_STOPEI
+# define CSR_TOPI CSR_STOPI
+
# define SR_IE SR_SIE
# define SR_PIE SR_SPIE
# define SR_PP SR_SPP
--
2.34.1
The QEMU virt machine supports AIA emulation and quite a few RISC-V
platforms with AIA support are under development, so let us select the
APLIC and IMSIC drivers for all RISC-V platforms.
Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/Kconfig | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d153e1cd890b..616a27e43827 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -127,6 +127,8 @@ config RISCV
select OF_IRQ
select PCI_DOMAINS_GENERIC if PCI
select PCI_MSI if PCI
+ select RISCV_APLIC
+ select RISCV_IMSIC
select RISCV_INTC
select RISCV_TIMER if RISCV_SBI
select SIFIVE_PLIC
--
2.34.1
The RISC-V advanced interrupt architecture (AIA) specification defines
a new MSI controller for managing MSIs on a RISC-V platform. This new
MSI controller is referred to as the incoming message signaled interrupt
controller (IMSIC), which manages MSIs on a per-HART (or per-CPU) basis.
(For more details, refer to https://github.com/riscv/riscv-aia)
This patch adds an irqchip driver for the RISC-V IMSIC found on RISC-V
platforms.
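
To make the per-HART addressing concrete, the user-space sketch below
models how an MSI target address could be derived from a HART's base MSI
page and a guest interrupt file index. It assumes one 4 KiB page per
interrupt file; IMSIC_PAGE_SZ and imsic_msi_addr are hypothetical names
used only for this illustration and are not the driver's API.

```c
#include <stdint.h>

/* Assumption: each interrupt file occupies one 4 KiB MMIO page. */
#define IMSIC_PAGE_SZ	4096ULL

/*
 * Compute the MSI target address for a guest interrupt file of a HART.
 * hart_base_pa is the HART's first MSI page and guest_index selects the
 * interrupt file (0 is the HART's own file). guest_index_bits is assumed
 * to be smaller than 64.
 * Returns 0 on success, -1 if guest_index does not fit in
 * guest_index_bits.
 */
int imsic_msi_addr(uint64_t hart_base_pa, unsigned int guest_index_bits,
		   unsigned int guest_index, uint64_t *out_pa)
{
	if (guest_index >= (1ULL << guest_index_bits))
		return -1;

	*out_pa = hart_base_pa + guest_index * IMSIC_PAGE_SZ;
	return 0;
}
```

A device then raises interrupt identity N on that HART by writing N to the
computed address, which matches how the MSI message carries the identity in
its data field.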
Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Kconfig | 14 +-
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-riscv-imsic.c | 1174 +++++++++++++++++++++++++++
include/linux/irqchip/riscv-imsic.h | 92 +++
4 files changed, 1280 insertions(+), 1 deletion(-)
create mode 100644 drivers/irqchip/irq-riscv-imsic.c
create mode 100644 include/linux/irqchip/riscv-imsic.h
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 9e65345ca3f6..a1315189a595 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -29,7 +29,6 @@ config ARM_GIC_V2M
config GIC_NON_BANKED
bool
-
config ARM_GIC_V3
bool
select IRQ_DOMAIN_HIERARCHY
@@ -548,6 +547,19 @@ config SIFIVE_PLIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
+config RISCV_IMSIC
+ bool
+ depends on RISCV
+ select IRQ_DOMAIN_HIERARCHY
+ select GENERIC_MSI_IRQ_DOMAIN
+
+config RISCV_IMSIC_PCI
+ bool
+ depends on RISCV_IMSIC
+ depends on PCI
+ depends on PCI_MSI
+ default RISCV_IMSIC
+
config EXYNOS_IRQ_COMBINER
bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 87b49a10962c..22c723cc6ec8 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
+obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
new file mode 100644
index 000000000000..4c16b66738d6
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic.c
@@ -0,0 +1,1174 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-imsic: " fmt
+#include <linux/bitmap.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+#include <asm/hwcap.h>
+
+#define IMSIC_DISABLE_EIDELIVERY 0
+#define IMSIC_ENABLE_EIDELIVERY 1
+#define IMSIC_DISABLE_EITHRESHOLD 1
+#define IMSIC_ENABLE_EITHRESHOLD 0
+
+#define imsic_csr_write(__c, __v) \
+do { \
+ csr_write(CSR_ISELECT, __c); \
+ csr_write(CSR_IREG, __v); \
+} while (0)
+
+#define imsic_csr_read(__c) \
+({ \
+ unsigned long __v; \
+ csr_write(CSR_ISELECT, __c); \
+ __v = csr_read(CSR_IREG); \
+ __v; \
+})
+
+#define imsic_csr_set(__c, __v) \
+do { \
+ csr_write(CSR_ISELECT, __c); \
+ csr_set(CSR_IREG, __v); \
+} while (0)
+
+#define imsic_csr_clear(__c, __v) \
+do { \
+ csr_write(CSR_ISELECT, __c); \
+ csr_clear(CSR_IREG, __v); \
+} while (0)
+
+struct imsic_mmio {
+ phys_addr_t pa;
+ void __iomem *va;
+ unsigned long size;
+};
+
+struct imsic_priv {
+ /* Global configuration common for all HARTs */
+ struct imsic_global_config global;
+
+ /* MMIO regions */
+ u32 num_mmios;
+ struct imsic_mmio *mmios;
+
+ /* Global state of interrupt identities */
+ raw_spinlock_t ids_lock;
+ unsigned long *ids_used_bimap;
+ unsigned long *ids_enabled_bimap;
+ unsigned int *ids_target_cpu;
+
+ /* Mask for connected CPUs */
+ struct cpumask lmask;
+
+ /* IPI interrupt identity */
+ u32 ipi_id;
+ u32 ipi_lsync_id;
+
+ /* IRQ domains */
+ struct irq_domain *base_domain;
+ struct irq_domain *pci_domain;
+ struct irq_domain *plat_domain;
+};
+
+struct imsic_handler {
+ /* Local configuration for given HART */
+ struct imsic_local_config local;
+
+ /* Pointer to private context */
+ struct imsic_priv *priv;
+};
+
+static bool imsic_init_done;
+
+static int imsic_parent_irq;
+static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
+
+const struct imsic_global_config *imsic_get_global_config(void)
+{
+ struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
+
+ if (!handler || !handler->priv)
+ return NULL;
+
+ return &handler->priv->global;
+}
+EXPORT_SYMBOL_GPL(imsic_get_global_config);
+
+const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
+{
+ struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
+
+ if (!handler || !handler->priv)
+ return NULL;
+
+ return &handler->local;
+}
+EXPORT_SYMBOL_GPL(imsic_get_local_config);
+
+static int imsic_cpu_page_phys(unsigned int cpu,
+ unsigned int guest_index,
+ phys_addr_t *out_msi_pa)
+{
+ struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
+ struct imsic_global_config *global;
+ struct imsic_local_config *local;
+
+ if (!handler || !handler->priv)
+ return -ENODEV;
+ local = &handler->local;
+ global = &handler->priv->global;
+
+ if (BIT(global->guest_index_bits) <= guest_index)
+ return -EINVAL;
+
+ if (out_msi_pa)
+ *out_msi_pa = local->msi_pa +
+ (guest_index * IMSIC_MMIO_PAGE_SZ);
+
+ return 0;
+}
+
+static int imsic_get_cpu(struct imsic_priv *priv,
+ const struct cpumask *mask_val, bool force,
+ unsigned int *out_target_cpu)
+{
+ struct cpumask amask;
+ unsigned int cpu;
+
+ cpumask_and(&amask, &priv->lmask, mask_val);
+
+ if (force)
+ cpu = cpumask_first(&amask);
+ else
+ cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+ if (cpu >= nr_cpu_ids)
+ return -EINVAL;
+
+ if (out_target_cpu)
+ *out_target_cpu = cpu;
+
+ return 0;
+}
+
+static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
+ struct msi_msg *msg)
+{
+ phys_addr_t msi_addr;
+ int err;
+
+ err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+ if (err)
+ return err;
+
+ msg->address_hi = upper_32_bits(msi_addr);
+ msg->address_lo = lower_32_bits(msi_addr);
+ msg->data = id;
+
+ return err;
+}
+
+static void imsic_id_set_target(struct imsic_priv *priv,
+ unsigned int id, unsigned int target_cpu)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&priv->ids_lock, flags);
+ priv->ids_target_cpu[id] = target_cpu;
+ raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+}
+
+static unsigned int imsic_id_get_target(struct imsic_priv *priv,
+ unsigned int id)
+{
+ unsigned int ret;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&priv->ids_lock, flags);
+ ret = priv->ids_target_cpu[id];
+ raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+
+ return ret;
+}
+
+static void __imsic_eix_update(unsigned long base_id,
+ unsigned long num_id, bool pend, bool val)
+{
+ unsigned long i, isel, ireg, flags;
+ unsigned long id = base_id, last_id = base_id + num_id;
+
+ while (id < last_id) {
+ isel = id / BITS_PER_LONG;
+ isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
+ isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
+
+ ireg = 0;
+ for (i = id & (__riscv_xlen - 1);
+ (id < last_id) && (i < __riscv_xlen); i++) {
+ ireg |= BIT(i);
+ id++;
+ }
+
+ /*
+ * The IMSIC EIEx and EIPx registers are accessed
+ * indirectly using the ISELECT and IREG CSRs, so we
+ * save/restore local IRQs to ensure that we don't
+ * get preempted while accessing IMSIC registers.
+ */
+ local_irq_save(flags);
+ if (val)
+ imsic_csr_set(isel, ireg);
+ else
+ imsic_csr_clear(isel, ireg);
+ local_irq_restore(flags);
+ }
+}
+
+#define __imsic_id_enable(__id) \
+ __imsic_eix_update((__id), 1, false, true)
+#define __imsic_id_disable(__id) \
+ __imsic_eix_update((__id), 1, false, false)
+
+#ifdef CONFIG_SMP
+static void __imsic_id_smp_sync(struct imsic_priv *priv)
+{
+ struct imsic_handler *handler;
+ struct cpumask amask;
+ int cpu;
+
+ cpumask_and(&amask, &priv->lmask, cpu_online_mask);
+ for_each_cpu(cpu, &amask) {
+ if (cpu == smp_processor_id())
+ continue;
+
+ handler = per_cpu_ptr(&imsic_handlers, cpu);
+ if (!handler || !handler->priv || !handler->local.msi_va) {
+ pr_warn("CPU%d: handler not initialized\n", cpu);
+ continue;
+ }
+
+ writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
+ }
+}
+#else
+#define __imsic_id_smp_sync(__priv)
+#endif
+
+static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&priv->ids_lock, flags);
+ bitmap_set(priv->ids_enabled_bimap, id, 1);
+ __imsic_id_enable(id);
+ raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+
+ __imsic_id_smp_sync(priv);
+}
+
+static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&priv->ids_lock, flags);
+ bitmap_clear(priv->ids_enabled_bimap, id, 1);
+ __imsic_id_disable(id);
+ raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+
+ __imsic_id_smp_sync(priv);
+}
+
+static void imsic_ids_local_sync(struct imsic_priv *priv)
+{
+ int i;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&priv->ids_lock, flags);
+ for (i = 1; i <= priv->global.nr_ids; i++) {
+ if (priv->ipi_id == i || priv->ipi_lsync_id == i)
+ continue;
+
+ if (test_bit(i, priv->ids_enabled_bimap))
+ __imsic_id_enable(i);
+ else
+ __imsic_id_disable(i);
+ }
+ raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+}
+
+static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
+{
+ if (enable) {
+ imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
+ imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
+ } else {
+ imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
+ imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
+ }
+}
+
+static int imsic_ids_alloc(struct imsic_priv *priv,
+ unsigned int max_id, unsigned int order)
+{
+ int ret;
+ unsigned long flags;
+
+ if ((priv->global.nr_ids < max_id) ||
+ (max_id < BIT(order)))
+ return -EINVAL;
+
+ raw_spin_lock_irqsave(&priv->ids_lock, flags);
+ ret = bitmap_find_free_region(priv->ids_used_bimap,
+ max_id + 1, order);
+ raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+
+ return ret;
+}
+
+static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
+ unsigned int order)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&priv->ids_lock, flags);
+ bitmap_release_region(priv->ids_used_bimap, base_id, order);
+ raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
+}
+
+static int __init imsic_ids_init(struct imsic_priv *priv)
+{
+ int i;
+ struct imsic_global_config *global = &priv->global;
+
+ raw_spin_lock_init(&priv->ids_lock);
+
+ /* Allocate used bitmap */
+ priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
+ sizeof(unsigned long), GFP_KERNEL);
+ if (!priv->ids_used_bimap)
+ return -ENOMEM;
+
+ /* Allocate enabled bitmap */
+ priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
+ sizeof(unsigned long), GFP_KERNEL);
+ if (!priv->ids_enabled_bimap) {
+ kfree(priv->ids_used_bimap);
+ return -ENOMEM;
+ }
+
+ /* Allocate target CPU array */
+ priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
+ sizeof(unsigned int), GFP_KERNEL);
+ if (!priv->ids_target_cpu) {
+ kfree(priv->ids_enabled_bimap);
+ kfree(priv->ids_used_bimap);
+ return -ENOMEM;
+ }
+ for (i = 0; i <= global->nr_ids; i++)
+ priv->ids_target_cpu[i] = UINT_MAX;
+
+ /* Reserve ID#0 because it is special and never implemented */
+ bitmap_set(priv->ids_used_bimap, 0, 1);
+
+ return 0;
+}
+
+static void __init imsic_ids_cleanup(struct imsic_priv *priv)
+{
+ kfree(priv->ids_target_cpu);
+ kfree(priv->ids_enabled_bimap);
+ kfree(priv->ids_used_bimap);
+}
+
+#ifdef CONFIG_SMP
+static void imsic_ipi_send(unsigned int cpu)
+{
+ struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
+
+ if (!handler || !handler->priv || !handler->local.msi_va) {
+ pr_warn("CPU%d: handler not initialized\n", cpu);
+ return;
+ }
+
+ writel(handler->priv->ipi_id, handler->local.msi_va);
+}
+
+static void imsic_ipi_enable(struct imsic_priv *priv)
+{
+ __imsic_id_enable(priv->ipi_id);
+ __imsic_id_enable(priv->ipi_lsync_id);
+}
+
+static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
+{
+ int virq;
+
+ /* Allocate interrupt identity for IPIs */
+ virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
+ if (virq < 0)
+ return virq;
+ priv->ipi_id = virq;
+
+ /* Create IMSIC IPI multiplexing */
+ virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);
+ if (virq <= 0) {
+ imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
+ return (virq < 0) ? virq : -ENOMEM;
+ }
+
+ /* Set vIRQ range */
+ riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
+
+ /* Allocate interrupt identity for local enable/disable sync */
+ virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
+ if (virq < 0) {
+ imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
+ return virq;
+ }
+ priv->ipi_lsync_id = virq;
+
+ return 0;
+}
+
+static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
+{
+ imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
+ if (priv->ipi_id)
+ imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
+}
+#else
+static void imsic_ipi_enable(struct imsic_priv *priv)
+{
+}
+
+static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
+{
+ /* Clear the IPI ids because we are not using IPIs */
+ priv->ipi_id = 0;
+ priv->ipi_lsync_id = 0;
+ return 0;
+}
+
+static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
+{
+}
+#endif
+
+static void imsic_irq_mask(struct irq_data *d)
+{
+ imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
+}
+
+static void imsic_irq_unmask(struct irq_data *d)
+{
+ imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
+}
+
+static void imsic_irq_compose_msi_msg(struct irq_data *d,
+ struct msi_msg *msg)
+{
+ struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
+ unsigned int cpu;
+ int err;
+
+ cpu = imsic_id_get_target(priv, d->hwirq);
+ WARN_ON(cpu == UINT_MAX);
+
+ err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
+ WARN_ON(err);
+
+ iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
+}
+
+#ifdef CONFIG_SMP
+static int imsic_irq_set_affinity(struct irq_data *d,
+ const struct cpumask *mask_val,
+ bool force)
+{
+ struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
+ unsigned int target_cpu;
+ int rc;
+
+ rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
+ if (rc)
+ return rc;
+
+ imsic_id_set_target(priv, d->hwirq, target_cpu);
+ irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
+
+ return IRQ_SET_MASK_OK;
+}
+#endif
+
+static struct irq_chip imsic_irq_base_chip = {
+ .name = "RISC-V IMSIC-BASE",
+ .irq_mask = imsic_irq_mask,
+ .irq_unmask = imsic_irq_unmask,
+#ifdef CONFIG_SMP
+ .irq_set_affinity = imsic_irq_set_affinity,
+#endif
+ .irq_compose_msi_msg = imsic_irq_compose_msi_msg,
+ .flags = IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int imsic_irq_domain_alloc(struct irq_domain *domain,
+ unsigned int virq,
+ unsigned int nr_irqs,
+ void *args)
+{
+ struct imsic_priv *priv = domain->host_data;
+ msi_alloc_info_t *info = args;
+ phys_addr_t msi_addr;
+ int i, hwirq, err = 0;
+ unsigned int cpu;
+
+ err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
+ if (err)
+ return err;
+
+ err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+ if (err)
+ return err;
+
+ hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
+ get_count_order(nr_irqs));
+ if (hwirq < 0)
+ return hwirq;
+
+ err = iommu_dma_prepare_msi(info->desc, msi_addr);
+ if (err)
+ goto fail;
+
+ for (i = 0; i < nr_irqs; i++) {
+ imsic_id_set_target(priv, hwirq + i, cpu);
+ irq_domain_set_info(domain, virq + i, hwirq + i,
+ &imsic_irq_base_chip, priv,
+ handle_simple_irq, NULL, NULL);
+ irq_set_noprobe(virq + i);
+ irq_set_affinity(virq + i, &priv->lmask);
+ }
+
+ return 0;
+
+fail:
+ imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
+ return err;
+}
+
+static void imsic_irq_domain_free(struct irq_domain *domain,
+ unsigned int virq,
+ unsigned int nr_irqs)
+{
+ struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+ struct imsic_priv *priv = domain->host_data;
+
+ imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
+ irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static const struct irq_domain_ops imsic_base_domain_ops = {
+ .alloc = imsic_irq_domain_alloc,
+ .free = imsic_irq_domain_free,
+};
+
+#ifdef CONFIG_RISCV_IMSIC_PCI
+
+static void imsic_pci_mask_irq(struct irq_data *d)
+{
+ pci_msi_mask_irq(d);
+ irq_chip_mask_parent(d);
+}
+
+static void imsic_pci_unmask_irq(struct irq_data *d)
+{
+ pci_msi_unmask_irq(d);
+ irq_chip_unmask_parent(d);
+}
+
+static struct irq_chip imsic_pci_irq_chip = {
+ .name = "RISC-V IMSIC-PCI",
+ .irq_mask = imsic_pci_mask_irq,
+ .irq_unmask = imsic_pci_unmask_irq,
+ .irq_eoi = irq_chip_eoi_parent,
+};
+
+static struct msi_domain_ops imsic_pci_domain_ops = {
+};
+
+static struct msi_domain_info imsic_pci_domain_info = {
+ .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+ MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
+ .ops = &imsic_pci_domain_ops,
+ .chip = &imsic_pci_irq_chip,
+};
+
+#endif
+
+static struct irq_chip imsic_plat_irq_chip = {
+ .name = "RISC-V IMSIC-PLAT",
+};
+
+static struct msi_domain_ops imsic_plat_domain_ops = {
+};
+
+static struct msi_domain_info imsic_plat_domain_info = {
+ .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
+ .ops = &imsic_plat_domain_ops,
+ .chip = &imsic_plat_irq_chip,
+};
+
+static int __init imsic_irq_domains_init(struct imsic_priv *priv,
+ struct fwnode_handle *fwnode)
+{
+ /* Create Base IRQ domain */
+ priv->base_domain = irq_domain_create_tree(fwnode,
+ &imsic_base_domain_ops, priv);
+ if (!priv->base_domain) {
+ pr_err("Failed to create IMSIC base domain\n");
+ return -ENOMEM;
+ }
+ irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
+
+#ifdef CONFIG_RISCV_IMSIC_PCI
+ /* Create PCI MSI domain */
+ priv->pci_domain = pci_msi_create_irq_domain(fwnode,
+ &imsic_pci_domain_info,
+ priv->base_domain);
+ if (!priv->pci_domain) {
+ pr_err("Failed to create IMSIC PCI domain\n");
+ irq_domain_remove(priv->base_domain);
+ return -ENOMEM;
+ }
+#endif
+
+ /* Create Platform MSI domain */
+ priv->plat_domain = platform_msi_create_irq_domain(fwnode,
+ &imsic_plat_domain_info,
+ priv->base_domain);
+ if (!priv->plat_domain) {
+ pr_err("Failed to create IMSIC platform domain\n");
+ if (priv->pci_domain)
+ irq_domain_remove(priv->pci_domain);
+ irq_domain_remove(priv->base_domain);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+/*
+ * To handle an interrupt, we read the TOPEI CSR and write zero in one
+ * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
+ * Linux interrupt number and let Linux IRQ subsystem handle it.
+ */
+static void imsic_handle_irq(struct irq_desc *desc)
+{
+ struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ struct imsic_priv *priv = handler->priv;
+ irq_hw_number_t hwirq;
+ int err;
+
+ WARN_ON_ONCE(!handler->priv);
+
+ chained_irq_enter(chip, desc);
+
+ while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
+ hwirq = hwirq >> TOPEI_ID_SHIFT;
+
+ if (hwirq == priv->ipi_id) {
+#ifdef CONFIG_SMP
+ ipi_mux_process();
+#endif
+ continue;
+ } else if (hwirq == priv->ipi_lsync_id) {
+ imsic_ids_local_sync(priv);
+ continue;
+ }
+
+ err = generic_handle_domain_irq(priv->base_domain, hwirq);
+ if (unlikely(err))
+ pr_warn_ratelimited(
+ "hwirq %lu mapping not found\n", hwirq);
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static int imsic_starting_cpu(unsigned int cpu)
+{
+ struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
+ struct imsic_priv *priv = handler->priv;
+
+ /* Enable per-CPU parent interrupt */
+ if (imsic_parent_irq)
+ enable_percpu_irq(imsic_parent_irq,
+ irq_get_trigger_type(imsic_parent_irq));
+ else
+ pr_warn("cpu%d: parent irq not available\n", cpu);
+
+ /* Enable IPIs */
+ imsic_ipi_enable(priv);
+
+ /*
+ * Interrupt identities might have been enabled/disabled while
+ * this CPU was not running so sync-up local enable/disable state.
+ */
+ imsic_ids_local_sync(priv);
+
+ /* Locally enable interrupt delivery */
+ imsic_ids_local_delivery(priv, true);
+
+ return 0;
+}
+
+struct imsic_fwnode_ops {
+ u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
+ void *fwopaque);
+ int (*parent_hartid)(struct fwnode_handle *fwnode,
+ void *fwopaque, u32 index,
+ unsigned long *out_hartid);
+ u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
+ int (*mmio_to_resource)(struct fwnode_handle *fwnode,
+ void *fwopaque, u32 index,
+ struct resource *res);
+ void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
+ void *fwopaque, u32 index);
+ int (*read_u32)(struct fwnode_handle *fwnode,
+ void *fwopaque, const char *prop, u32 *out_val);
+ bool (*read_bool)(struct fwnode_handle *fwnode,
+ void *fwopaque, const char *prop);
+};
+
+static int __init imsic_init(struct imsic_fwnode_ops *fwops,
+ struct fwnode_handle *fwnode,
+ void *fwopaque)
+{
+ struct resource res;
+ phys_addr_t base_addr;
+ int rc, nr_parent_irqs;
+ struct imsic_mmio *mmio;
+ struct imsic_priv *priv;
+ struct irq_domain *domain;
+ struct imsic_handler *handler;
+ struct imsic_global_config *global;
+ u32 i, tmp, nr_handlers = 0;
+
+ if (imsic_init_done) {
+ pr_err("%pfwP: already initialized hence ignoring\n",
+ fwnode);
+ return -ENODEV;
+ }
+
+ if (!riscv_isa_extension_available(NULL, SxAIA)) {
+ pr_err("%pfwP: AIA support not available\n", fwnode);
+ return -ENODEV;
+ }
+
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+ global = &priv->global;
+
+ /* Find number of parent interrupts */
+ nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
+ if (!nr_parent_irqs) {
+ pr_err("%pfwP: no parent irqs available\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_priv;
+ }
+
+ /* Find number of guest index bits in MSI address */
+ rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
+ &global->guest_index_bits);
+ if (rc)
+ global->guest_index_bits = 0;
+ tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
+ if (tmp < global->guest_index_bits) {
+ pr_err("%pfwP: guest index bits too big\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_priv;
+ }
+
+ /* Find number of HART index bits */
+ rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
+ &global->hart_index_bits);
+ if (rc) {
+ /* Assume default value */
+ global->hart_index_bits = __fls(nr_parent_irqs);
+ if (BIT(global->hart_index_bits) < nr_parent_irqs)
+ global->hart_index_bits++;
+ }
+ tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
+ global->guest_index_bits;
+ if (tmp < global->hart_index_bits) {
+ pr_err("%pfwP: HART index bits too big\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_priv;
+ }
+
+ /* Find number of group index bits */
+ rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
+ &global->group_index_bits);
+ if (rc)
+ global->group_index_bits = 0;
+ tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
+ global->guest_index_bits - global->hart_index_bits;
+ if (tmp < global->group_index_bits) {
+ pr_err("%pfwP: group index bits too big\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_priv;
+ }
+
+ /*
+ * Find first bit position of group index.
+ * If not specified, assume the default APLIC-IMSIC configuration.
+ */
+ rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
+ &global->group_index_shift);
+ if (rc)
+ global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
+ tmp = global->group_index_bits + global->group_index_shift - 1;
+ if (tmp >= BITS_PER_LONG) {
+ pr_err("%pfwP: group index shift too big\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_priv;
+ }
+
+ /* Find number of interrupt identities */
+ rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
+ &global->nr_ids);
+ if (rc) {
+ pr_err("%pfwP: number of interrupt identities not found\n",
+ fwnode);
+ goto out_free_priv;
+ }
+ if ((global->nr_ids < IMSIC_MIN_ID) ||
+ (global->nr_ids >= IMSIC_MAX_ID) ||
+ ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+ pr_err("%pfwP: invalid number of interrupt identities\n",
+ fwnode);
+ rc = -EINVAL;
+ goto out_free_priv;
+ }
+
+ /* Find number of guest interrupt identities */
+ if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
+ &global->nr_guest_ids))
+ global->nr_guest_ids = global->nr_ids;
+ if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
+ (global->nr_guest_ids >= IMSIC_MAX_ID) ||
+ ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+ pr_err("%pfwP: invalid number of guest interrupt identities\n",
+ fwnode);
+ rc = -EINVAL;
+ goto out_free_priv;
+ }
+
+ /* Compute base address */
+ rc = fwops->mmio_to_resource(fwnode, fwopaque, 0, &res);
+ if (rc) {
+ pr_err("%pfwP: first MMIO resource not found\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_priv;
+ }
+ global->base_addr = res.start;
+ global->base_addr &= ~(BIT(global->guest_index_bits +
+ global->hart_index_bits +
+ IMSIC_MMIO_PAGE_SHIFT) - 1);
+ global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+ global->group_index_shift);
+
+ /* Find number of MMIO register sets */
+ priv->num_mmios = fwops->nr_mmio(fwnode, fwopaque);
+
+ /* Allocate MMIO register sets */
+ priv->mmios = kcalloc(priv->num_mmios, sizeof(*mmio), GFP_KERNEL);
+ if (!priv->mmios) {
+ rc = -ENOMEM;
+ goto out_free_priv;
+ }
+
+ /* Parse and map MMIO register sets */
+ for (i = 0; i < priv->num_mmios; i++) {
+ mmio = &priv->mmios[i];
+ rc = fwops->mmio_to_resource(fwnode, fwopaque, i, &res);
+ if (rc) {
+ pr_err("%pfwP: unable to parse MMIO regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+ mmio->pa = res.start;
+ mmio->size = res.end - res.start + 1;
+
+ base_addr = mmio->pa;
+ base_addr &= ~(BIT(global->guest_index_bits +
+ global->hart_index_bits +
+ IMSIC_MMIO_PAGE_SHIFT) - 1);
+ base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+ global->group_index_shift);
+ if (base_addr != global->base_addr) {
+ rc = -EINVAL;
+ pr_err("%pfwP: address mismatch for regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+
+ mmio->va = fwops->mmio_map(fwnode, fwopaque, i);
+ if (!mmio->va) {
+ rc = -EIO;
+ pr_err("%pfwP: unable to map MMIO regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+ }
+
+ /* Initialize interrupt identity management */
+ rc = imsic_ids_init(priv);
+ if (rc) {
+ pr_err("%pfwP: failed to initialize interrupt management\n",
+ fwnode);
+ goto out_iounmap;
+ }
+
+ /* Configure handlers for target CPUs */
+ for (i = 0; i < nr_parent_irqs; i++) {
+ unsigned long reloff, hartid;
+ int j, cpu;
+
+ rc = fwops->parent_hartid(fwnode, fwopaque, i, &hartid);
+ if (rc) {
+ pr_warn("%pfwP: hart ID for parent irq%d not found\n",
+ fwnode, i);
+ continue;
+ }
+
+ cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0) {
+ pr_warn("%pfwP: invalid cpuid for parent irq%d\n",
+ fwnode, i);
+ continue;
+ }
+
+ /* Find MMIO location of MSI page */
+ mmio = NULL;
+ reloff = i * BIT(global->guest_index_bits) *
+ IMSIC_MMIO_PAGE_SZ;
+ for (j = 0; j < priv->num_mmios; j++) {
+ if (reloff < priv->mmios[j].size) {
+ mmio = &priv->mmios[j];
+ break;
+ }
+
+ /*
+ * MMIO region size may not be aligned to
+ * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
+ * if holes are present.
+ */
+ reloff -= ALIGN(priv->mmios[j].size,
+ BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ);
+ }
+ if (!mmio) {
+ pr_warn("%pfwP: MMIO not found for parent irq%d\n",
+ fwnode, i);
+ continue;
+ }
+
+ handler = per_cpu_ptr(&imsic_handlers, cpu);
+ if (handler->priv) {
+ pr_warn("%pfwP: CPU%d handler already configured.\n",
+ fwnode, cpu);
+ goto done;
+ }
+
+ cpumask_set_cpu(cpu, &priv->lmask);
+ handler->local.msi_pa = mmio->pa + reloff;
+ handler->local.msi_va = mmio->va + reloff;
+ handler->priv = priv;
+
+done:
+ nr_handlers++;
+ }
+
+ /* If no CPU handlers found then can't take interrupts */
+ if (!nr_handlers) {
+ pr_err("%pfwP: No CPU handlers found\n", fwnode);
+ rc = -ENODEV;
+ goto out_ids_cleanup;
+ }
+
+ /* Find parent domain and register chained handler */
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+ DOMAIN_BUS_ANY);
+ if (!domain) {
+ pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
+ rc = -ENOENT;
+ goto out_ids_cleanup;
+ }
+ imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (!imsic_parent_irq) {
+ pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
+ rc = -ENOENT;
+ goto out_ids_cleanup;
+ }
+ irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
+
+ /* Initialize IPI domain */
+ rc = imsic_ipi_domain_init(priv);
+ if (rc) {
+ pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
+ goto out_ids_cleanup;
+ }
+
+ /* Initialize IRQ and MSI domains */
+ rc = imsic_irq_domains_init(priv, fwnode);
+ if (rc) {
+ pr_err("%pfwP: Failed to initialize IRQ and MSI domains\n",
+ fwnode);
+ goto out_ipi_domain_cleanup;
+ }
+
+ /*
+ * Setup cpuhp state
+ *
+ * Don't disable per-CPU IMSIC file when CPU goes offline
+ * because this affects IPI and the masking/unmasking of
+ * virtual IPIs is done via generic IPI-Mux
+ */
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "irqchip/riscv/imsic:starting",
+ imsic_starting_cpu, NULL);
+
+ /*
+ * Only one IMSIC instance allowed in a platform for clean
+ * implementation of SMP IRQ affinity and per-CPU IPIs.
+ *
+ * This means on a multi-socket (or multi-die) platform we
+ * will have multiple MMIO regions for one IMSIC instance.
+ */
+ imsic_init_done = true;
+
+ pr_info("%pfwP: hart-index-bits: %d, guest-index-bits: %d\n",
+ fwnode, global->hart_index_bits, global->guest_index_bits);
+ pr_info("%pfwP: group-index-bits: %d, group-index-shift: %d\n",
+ fwnode, global->group_index_bits, global->group_index_shift);
+ pr_info("%pfwP: mapped %d interrupts for %d CPUs at %pa\n",
+ fwnode, global->nr_ids, nr_handlers, &global->base_addr);
+ if (priv->ipi_lsync_id)
+ pr_info("%pfwP: enable/disable sync using interrupt %d\n",
+ fwnode, priv->ipi_lsync_id);
+ if (priv->ipi_id)
+ pr_info("%pfwP: providing IPIs using interrupt %d\n",
+ fwnode, priv->ipi_id);
+
+ return 0;
+
+out_ipi_domain_cleanup:
+ imsic_ipi_domain_cleanup(priv);
+out_ids_cleanup:
+ imsic_ids_cleanup(priv);
+out_iounmap:
+ for (i = 0; i < priv->num_mmios; i++) {
+ if (priv->mmios[i].va)
+ iounmap(priv->mmios[i].va);
+ }
+ kfree(priv->mmios);
+out_free_priv:
+ kfree(priv);
+ return rc;
+}
+
+static u32 __init imsic_dt_nr_parent_irq(struct fwnode_handle *fwnode,
+ void *fwopaque)
+{
+ return of_irq_count(to_of_node(fwnode));
+}
+
+static int __init imsic_dt_parent_hartid(struct fwnode_handle *fwnode,
+ void *fwopaque, u32 index,
+ unsigned long *out_hartid)
+{
+ struct of_phandle_args parent;
+ int rc;
+
+ rc = of_irq_parse_one(to_of_node(fwnode), index, &parent);
+ if (rc)
+ return rc;
+
+ /*
+ * Skip interrupts other than external interrupts for
+ * current privilege level.
+ */
+ if (parent.args[0] != RV_IRQ_EXT)
+ return -EINVAL;
+
+ return riscv_of_parent_hartid(parent.np, out_hartid);
+}
+
+static u32 __init imsic_dt_nr_mmio(struct fwnode_handle *fwnode,
+ void *fwopaque)
+{
+ u32 ret = 0;
+ struct resource res;
+
+ while (!of_address_to_resource(to_of_node(fwnode), ret, &res))
+ ret++;
+
+ return ret;
+}
+
+static int __init imsic_mmio_to_resource(struct fwnode_handle *fwnode,
+ void *fwopaque, u32 index,
+ struct resource *res)
+{
+ return of_address_to_resource(to_of_node(fwnode), index, res);
+}
+
+static void __iomem __init *imsic_dt_mmio_map(struct fwnode_handle *fwnode,
+ void *fwopaque, u32 index)
+{
+ return of_iomap(to_of_node(fwnode), index);
+}
+
+static int __init imsic_dt_read_u32(struct fwnode_handle *fwnode,
+ void *fwopaque, const char *prop,
+ u32 *out_val)
+{
+ return of_property_read_u32(to_of_node(fwnode), prop, out_val);
+}
+
+static bool __init imsic_dt_read_bool(struct fwnode_handle *fwnode,
+ void *fwopaque, const char *prop)
+{
+ return of_property_read_bool(to_of_node(fwnode), prop);
+}
+
+static int __init imsic_dt_init(struct device_node *node,
+ struct device_node *parent)
+{
+ struct imsic_fwnode_ops ops = {
+ .nr_parent_irq = imsic_dt_nr_parent_irq,
+ .parent_hartid = imsic_dt_parent_hartid,
+ .nr_mmio = imsic_dt_nr_mmio,
+ .mmio_to_resource = imsic_mmio_to_resource,
+ .mmio_map = imsic_dt_mmio_map,
+ .read_u32 = imsic_dt_read_u32,
+ .read_bool = imsic_dt_read_bool,
+ };
+
+ return imsic_init(&ops, &node->fwnode, NULL);
+}
+IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_dt_init);
diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
new file mode 100644
index 000000000000..5d1387adc0ba
--- /dev/null
+++ b/include/linux/irqchip/riscv-imsic.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_IMSIC_H
+#define __LINUX_IRQCHIP_RISCV_IMSIC_H
+
+#include <linux/types.h>
+#include <asm/csr.h>
+
+#define IMSIC_MMIO_PAGE_SHIFT 12
+#define IMSIC_MMIO_PAGE_SZ (1UL << IMSIC_MMIO_PAGE_SHIFT)
+#define IMSIC_MMIO_PAGE_LE 0x00
+#define IMSIC_MMIO_PAGE_BE 0x04
+
+#define IMSIC_MIN_ID 63
+#define IMSIC_MAX_ID 2048
+
+#define IMSIC_EIDELIVERY 0x70
+
+#define IMSIC_EITHRESHOLD 0x72
+
+#define IMSIC_EIP0 0x80
+#define IMSIC_EIP63 0xbf
+#define IMSIC_EIPx_BITS 32
+
+#define IMSIC_EIE0 0xc0
+#define IMSIC_EIE63 0xff
+#define IMSIC_EIEx_BITS 32
+
+#define IMSIC_FIRST IMSIC_EIDELIVERY
+#define IMSIC_LAST IMSIC_EIE63
+
+#define IMSIC_MMIO_SETIPNUM_LE 0x00
+#define IMSIC_MMIO_SETIPNUM_BE 0x04
+
+struct imsic_global_config {
+ /*
+ * MSI Target Address Scheme
+ *
+ * XLEN-1 12 0
+ * | | |
+ * -------------------------------------------------------------
+ * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index| 0 |
+ * -------------------------------------------------------------
+ */
+
+ /* Bits representing Guest index, HART index, and Group index */
+ u32 guest_index_bits;
+ u32 hart_index_bits;
+ u32 group_index_bits;
+ u32 group_index_shift;
+
+ /* Global base address matching all target MSI addresses */
+ phys_addr_t base_addr;
+
+ /* Number of interrupt identities */
+ u32 nr_ids;
+
+ /* Number of guest interrupt identities */
+ u32 nr_guest_ids;
+};
+
+struct imsic_local_config {
+ phys_addr_t msi_pa;
+ void __iomem *msi_va;
+};
+
+#ifdef CONFIG_RISCV_IMSIC
+
+extern const struct imsic_global_config *imsic_get_global_config(void);
+
+extern const struct imsic_local_config *imsic_get_local_config(
+ unsigned int cpu);
+
+#else
+
+static inline const struct imsic_global_config *imsic_get_global_config(void)
+{
+ return NULL;
+}
+
+static inline const struct imsic_local_config *imsic_get_local_config(
+ unsigned int cpu)
+{
+ return NULL;
+}
+
+#endif
+
+#endif
--
2.34.1
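The MSI target address scheme documented in struct imsic_global_config above
can be sketched in plain userspace C. This is only an illustration, not driver
code: struct msi_layout and both helpers are hypothetical names, and the
assumption is that the group, HART and guest indexes are simply OR-ed into the
base address at the bit positions shown in the header comment.

```c
#include <assert.h>
#include <stdint.h>

#define IMSIC_MMIO_PAGE_SHIFT 12

/* Hypothetical userspace mirror of the index fields in imsic_global_config. */
struct msi_layout {
	uint32_t guest_index_bits;
	uint32_t hart_index_bits;
	uint32_t group_index_bits;
	uint32_t group_index_shift;
	uint64_t base_addr;
};

/* Compose the MSI page address of one interrupt file (assumed layout). */
static uint64_t msi_page_addr(const struct msi_layout *l, uint32_t group,
			      uint32_t hart, uint32_t guest)
{
	/* Each index must fit into the number of bits reserved for it. */
	assert(group < (UINT64_C(1) << l->group_index_bits));
	assert(hart < (UINT64_C(1) << l->hart_index_bits));
	assert(guest < (UINT64_C(1) << l->guest_index_bits));

	return l->base_addr |
	       ((uint64_t)group << l->group_index_shift) |
	       ((uint64_t)hart << (l->guest_index_bits + IMSIC_MMIO_PAGE_SHIFT)) |
	       ((uint64_t)guest << IMSIC_MMIO_PAGE_SHIFT);
}

/* Strip all index fields again, as imsic_init() does per MMIO region. */
static uint64_t msi_base_addr(const struct msi_layout *l, uint64_t addr)
{
	addr &= ~((UINT64_C(1) << (l->guest_index_bits + l->hart_index_bits +
				   IMSIC_MMIO_PAGE_SHIFT)) - 1);
	addr &= ~(((UINT64_C(1) << l->group_index_bits) - 1) <<
		  l->group_index_shift);
	return addr;
}
```

Masking any composed address with msi_base_addr() should give back the same
base for every interrupt file, which is the property the driver checks when it
compares each MMIO region against global->base_addr.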
The RISC-V advanced interrupt architecture (AIA) specification defines
a new interrupt controller for managing wired interrupts on a RISC-V
platform. This new interrupt controller is referred to as advanced
platform-level interrupt controller (APLIC) which can forward wired
interrupts to CPUs (or HARTs) as local interrupts or as message
signaled interrupts.
(For more details, refer to https://github.com/riscv/riscv-aia)
This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
platforms.
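
As a rough illustration of the two delivery modes, the per-source target
register value can be packed as sketched below. This is a standalone userspace
sketch, not the driver itself: the helper names are hypothetical, and the field
positions (HART index at bit 18, guest index at bit 12, 11-bit EIID, 8-bit
priority) are assumptions taken from the AIA target register layout rather than
from the header added by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout of the APLIC target register. */
#define TARGET_HART_IDX_SHIFT	18
#define TARGET_HART_IDX_MASK	0x3fffu
#define TARGET_GUEST_IDX_SHIFT	12
#define TARGET_GUEST_IDX_MASK	0x3fu
#define TARGET_IPRIO_MASK	0xffu
#define TARGET_EIID_MASK	0x7ffu

/* Direct mode: route a wired interrupt to one HART with a priority. */
static uint32_t aplic_target_direct(uint32_t hart_index, uint32_t prio)
{
	assert(hart_index <= TARGET_HART_IDX_MASK);
	return (hart_index << TARGET_HART_IDX_SHIFT) |
	       (prio & TARGET_IPRIO_MASK);
}

/* MSI mode: route a wired interrupt to an IMSIC file as identity eiid. */
static uint32_t aplic_target_msi(uint32_t hart_index, uint32_t guest_index,
				 uint32_t eiid)
{
	assert(hart_index <= TARGET_HART_IDX_MASK);
	assert(guest_index <= TARGET_GUEST_IDX_MASK);
	return (hart_index << TARGET_HART_IDX_SHIFT) |
	       (guest_index << TARGET_GUEST_IDX_SHIFT) |
	       (eiid & TARGET_EIID_MASK);
}
```

In direct mode the low bits carry a priority, while in MSI mode they carry the
interrupt identity that is written to the target IMSIC.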
Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Kconfig | 6 +
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-riscv-aplic.c | 670 ++++++++++++++++++++++++++++
include/linux/irqchip/riscv-aplic.h | 117 +++++
4 files changed, 794 insertions(+)
create mode 100644 drivers/irqchip/irq-riscv-aplic.c
create mode 100644 include/linux/irqchip/riscv-aplic.h
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index a1315189a595..936e59fe1f99 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -547,6 +547,12 @@ config SIFIVE_PLIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
+config RISCV_APLIC
+ bool
+ depends on RISCV
+ select IRQ_DOMAIN_HIERARCHY
+ select GENERIC_MSI_IRQ_DOMAIN
+
config RISCV_IMSIC
bool
depends on RISCV
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 22c723cc6ec8..6154e5bc4228 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
+obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic.o
obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
new file mode 100644
index 000000000000..63f20892d7d3
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic.c
@@ -0,0 +1,670 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-aplic.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+#include <linux/smp.h>
+
+#define APLIC_DEFAULT_PRIORITY 1
+#define APLIC_DISABLE_IDELIVERY 0
+#define APLIC_ENABLE_IDELIVERY 1
+#define APLIC_DISABLE_ITHRESHOLD 1
+#define APLIC_ENABLE_ITHRESHOLD 0
+
+struct aplic_msicfg {
+ phys_addr_t base_ppn;
+ u32 hhxs;
+ u32 hhxw;
+ u32 lhxs;
+ u32 lhxw;
+};
+
+struct aplic_idc {
+ unsigned int hart_index;
+ void __iomem *regs;
+ struct aplic_priv *priv;
+};
+
+struct aplic_priv {
+ struct device *dev;
+ u32 nr_irqs;
+ u32 nr_idcs;
+ void __iomem *regs;
+ struct irq_domain *irqdomain;
+ struct aplic_msicfg msicfg;
+ struct cpumask lmask;
+};
+
+static unsigned int aplic_idc_parent_irq;
+static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
+
+static void aplic_irq_unmask(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ writel(d->hwirq, priv->regs + APLIC_SETIENUM);
+
+ if (!priv->nr_idcs)
+ irq_chip_unmask_parent(d);
+}
+
+static void aplic_irq_mask(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
+
+ if (!priv->nr_idcs)
+ irq_chip_mask_parent(d);
+}
+
+static int aplic_set_type(struct irq_data *d, unsigned int type)
+{
+ u32 val = 0;
+ void __iomem *sourcecfg;
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ switch (type) {
+ case IRQ_TYPE_NONE:
+ val = APLIC_SOURCECFG_SM_INACTIVE;
+ break;
+ case IRQ_TYPE_LEVEL_LOW:
+ val = APLIC_SOURCECFG_SM_LEVEL_LOW;
+ break;
+ case IRQ_TYPE_LEVEL_HIGH:
+ val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
+ break;
+ case IRQ_TYPE_EDGE_FALLING:
+ val = APLIC_SOURCECFG_SM_EDGE_FALL;
+ break;
+ case IRQ_TYPE_EDGE_RISING:
+ val = APLIC_SOURCECFG_SM_EDGE_RISE;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
+ sourcecfg += (d->hwirq - 1) * sizeof(u32);
+ writel(val, sourcecfg);
+
+ return 0;
+}
+
+#ifdef CONFIG_SMP
+static int aplic_set_affinity(struct irq_data *d,
+ const struct cpumask *mask_val, bool force)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ struct aplic_idc *idc;
+ unsigned int cpu, val;
+ struct cpumask amask;
+ void __iomem *target;
+
+ if (!priv->nr_idcs)
+ return irq_chip_set_affinity_parent(d, mask_val, force);
+
+ cpumask_and(&amask, &priv->lmask, mask_val);
+
+ if (force)
+ cpu = cpumask_first(&amask);
+ else
+ cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+ if (cpu >= nr_cpu_ids)
+ return -EINVAL;
+
+ idc = per_cpu_ptr(&aplic_idcs, cpu);
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+ val <<= APLIC_TARGET_HART_IDX_SHIFT;
+ val |= APLIC_DEFAULT_PRIORITY;
+ writel(val, target);
+
+ irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+ return IRQ_SET_MASK_OK_DONE;
+}
+#endif
+
+static struct irq_chip aplic_chip = {
+ .name = "RISC-V APLIC",
+ .irq_mask = aplic_irq_mask,
+ .irq_unmask = aplic_irq_unmask,
+ .irq_set_type = aplic_set_type,
+#ifdef CONFIG_SMP
+ .irq_set_affinity = aplic_set_affinity,
+#endif
+ .flags = IRQCHIP_SET_TYPE_MASKED |
+ IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int aplic_irqdomain_translate(struct irq_domain *d,
+ struct irq_fwspec *fwspec,
+ unsigned long *hwirq,
+ unsigned int *type)
+{
+ if (WARN_ON(fwspec->param_count < 2))
+ return -EINVAL;
+ if (WARN_ON(!fwspec->param[0]))
+ return -EINVAL;
+
+ *hwirq = fwspec->param[0];
+ *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
+
+ WARN_ON(*type == IRQ_TYPE_NONE);
+
+ return 0;
+}
+
+static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
+ unsigned int virq, unsigned int nr_irqs,
+ void *arg)
+{
+ int i, ret;
+ unsigned int type;
+ irq_hw_number_t hwirq;
+ struct irq_fwspec *fwspec = arg;
+ struct aplic_priv *priv = platform_msi_get_host_data(domain);
+
+ ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
+ if (ret)
+ return ret;
+
+ ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++)
+ irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+ &aplic_chip, priv);
+
+ return 0;
+}
+
+static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
+ .translate = aplic_irqdomain_translate,
+ .alloc = aplic_irqdomain_msi_alloc,
+ .free = platform_msi_device_domain_free,
+};
+
+static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
+ unsigned int virq, unsigned int nr_irqs,
+ void *arg)
+{
+ int i, ret;
+ unsigned int type;
+ irq_hw_number_t hwirq;
+ struct irq_fwspec *fwspec = arg;
+ struct aplic_priv *priv = domain->host_data;
+
+ ret = aplic_irqdomain_translate(domain, fwspec, &hwirq, &type);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++) {
+ irq_domain_set_info(domain, virq + i, hwirq + i,
+ &aplic_chip, priv, handle_simple_irq,
+ NULL, NULL);
+ irq_set_affinity(virq + i, &priv->lmask);
+ }
+
+ return 0;
+}
+
+static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
+ .translate = aplic_irqdomain_translate,
+ .alloc = aplic_irqdomain_idc_alloc,
+ .free = irq_domain_free_irqs_top,
+};
+
+static void aplic_init_hw_irqs(struct aplic_priv *priv)
+{
+ int i;
+
+ /* Disable all interrupts */
+ for (i = 0; i <= priv->nr_irqs; i += 32)
+ writel(-1U, priv->regs + APLIC_CLRIE_BASE +
+ (i / 32) * sizeof(u32));
+
+ /* Set interrupt type and default priority for all interrupts */
+ for (i = 1; i <= priv->nr_irqs; i++) {
+ writel(0, priv->regs + APLIC_SOURCECFG_BASE +
+ (i - 1) * sizeof(u32));
+ writel(APLIC_DEFAULT_PRIORITY,
+ priv->regs + APLIC_TARGET_BASE +
+ (i - 1) * sizeof(u32));
+ }
+
+ /* Clear APLIC domaincfg */
+ writel(0, priv->regs + APLIC_DOMAINCFG);
+}
+
+static void aplic_init_hw_global(struct aplic_priv *priv)
+{
+ u32 val;
+#ifdef CONFIG_RISCV_M_MODE
+ u32 valH;
+
+ if (!priv->nr_idcs) {
+ val = priv->msicfg.base_ppn;
+ valH = (priv->msicfg.base_ppn >> 32) &
+ APLIC_xMSICFGADDRH_BAPPN_MASK;
+ valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
+ << APLIC_xMSICFGADDRH_LHXW_SHIFT;
+ valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
+ << APLIC_xMSICFGADDRH_HHXW_SHIFT;
+ valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
+ << APLIC_xMSICFGADDRH_LHXS_SHIFT;
+ valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
+ << APLIC_xMSICFGADDRH_HHXS_SHIFT;
+ writel(val, priv->regs + APLIC_xMSICFGADDR);
+ writel(valH, priv->regs + APLIC_xMSICFGADDRH);
+ }
+#endif
+
+ /* Setup APLIC domaincfg register */
+ val = readl(priv->regs + APLIC_DOMAINCFG);
+ val |= APLIC_DOMAINCFG_IE;
+ if (!priv->nr_idcs)
+ val |= APLIC_DOMAINCFG_DM;
+ writel(val, priv->regs + APLIC_DOMAINCFG);
+ if (readl(priv->regs + APLIC_DOMAINCFG) != val)
+ dev_warn(priv->dev,
+ "unable to write 0x%x in domaincfg\n", val);
+}
+
+static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
+{
+ unsigned int group_index, hart_index, guest_index, val;
+ struct device *dev = msi_desc_to_dev(desc);
+ struct aplic_priv *priv = dev_get_drvdata(dev);
+ struct irq_data *d = irq_get_irq_data(desc->irq);
+ struct aplic_msicfg *mc = &priv->msicfg;
+ phys_addr_t tppn, tbppn, msg_addr;
+ void __iomem *target;
+
+ /* For zeroed MSI, simply write zero into the target register */
+ if (!msg->address_hi && !msg->address_lo && !msg->data) {
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ writel(0, target);
+ return;
+ }
+
+ /* Sanity check on message data */
+ WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
+
+ /* Compute target MSI address */
+ msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
+ tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+
+ /* Compute target HART Base PPN */
+ tbppn = tppn;
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+ WARN_ON(tbppn != mc->base_ppn);
+
+ /* Compute target group and hart indexes */
+ group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
+ APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
+ hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
+ APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
+ hart_index |= (group_index << mc->lhxw);
+ WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
+
+ /* Compute target guest index */
+ guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
+
+ /* Update IRQ TARGET register */
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
+ << APLIC_TARGET_HART_IDX_SHIFT;
+ val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
+ << APLIC_TARGET_GUEST_IDX_SHIFT;
+ val |= (msg->data & APLIC_TARGET_EIID_MASK);
+ writel(val, target);
+}
+
+static int aplic_setup_msi(struct aplic_priv *priv)
+{
+ struct device *dev = priv->dev;
+ struct aplic_msicfg *mc = &priv->msicfg;
+ const struct imsic_global_config *imsic_global;
+
+ /*
+ * The APLIC outgoing MSI config registers assume target MSI
+ * controller to be RISC-V AIA IMSIC controller.
+ */
+ imsic_global = imsic_get_global_config();
+ if (!imsic_global) {
+ dev_err(dev, "IMSIC global config not found\n");
+ return -ENODEV;
+ }
+
+ /* Find number of guest index bits (LHXS) */
+ mc->lhxs = imsic_global->guest_index_bits;
+ if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
+ dev_err(dev, "IMSIC guest index bits too big for APLIC LHXS\n");
+ return -EINVAL;
+ }
+
+ /* Find number of HART index bits (LHXW) */
+ mc->lhxw = imsic_global->hart_index_bits;
+ if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
+ dev_err(dev, "IMSIC hart index bits too big for APLIC LHXW\n");
+ return -EINVAL;
+ }
+
+ /* Find number of group index bits (HHXW) */
+ mc->hhxw = imsic_global->group_index_bits;
+ if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
+ dev_err(dev, "IMSIC group index bits too big for APLIC HHXW\n");
+ return -EINVAL;
+ }
+
+ /* Find first bit position of group index (HHXS) */
+ mc->hhxs = imsic_global->group_index_shift;
+ if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
+ dev_err(dev, "IMSIC group index shift should be >= %d\n",
+ (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
+ return -EINVAL;
+ }
+ mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
+ if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
+ dev_err(dev, "IMSIC group index shift too big for APLIC HHXS\n");
+ return -EINVAL;
+ }
+
+ /* Compute PPN base */
+ mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+
+ /* Use all possible CPUs as lmask */
+ cpumask_copy(&priv->lmask, cpu_possible_mask);
+
+ return 0;
+}
+
+/*
+ * To handle APLIC IDC interrupts, we just read the CLAIMI register,
+ * which returns the highest priority pending interrupt and clears the
+ * pending bit of that interrupt. This process is repeated until the
+ * CLAIMI register returns zero.
+ */
+static void aplic_idc_handle_irq(struct irq_desc *desc)
+{
+ struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ irq_hw_number_t hw_irq;
+ int irq;
+
+ chained_irq_enter(chip, desc);
+
+ while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
+ hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
+ irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
+
+ if (unlikely(irq <= 0))
+ pr_warn_ratelimited("hw_irq %lu mapping not found\n",
+ hw_irq);
+ else
+ generic_handle_irq(irq);
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
+{
+ u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
+ u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
+
+ /* Priority must be less than threshold for interrupt triggering */
+ writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
+
+ /* Delivery must be set to 1 for interrupt triggering */
+ writel(de, idc->regs + APLIC_IDC_IDELIVERY);
+}
+
+static int aplic_idc_dying_cpu(unsigned int cpu)
+{
+ if (aplic_idc_parent_irq)
+ disable_percpu_irq(aplic_idc_parent_irq);
+
+ return 0;
+}
+
+static int aplic_idc_starting_cpu(unsigned int cpu)
+{
+ if (aplic_idc_parent_irq)
+ enable_percpu_irq(aplic_idc_parent_irq,
+ irq_get_trigger_type(aplic_idc_parent_irq));
+
+ return 0;
+}
+
+static int aplic_setup_idc(struct aplic_priv *priv)
+{
+ int i, j, rc, cpu, setup_count = 0;
+ struct device_node *node = priv->dev->of_node;
+ struct device *dev = priv->dev;
+ struct of_phandle_args parent;
+ struct irq_domain *domain;
+ unsigned long hartid;
+ struct aplic_idc *idc;
+ u32 val;
+
+ /* Setup per-CPU IDC and target CPU mask */
+ for (i = 0; i < priv->nr_idcs; i++) {
+ if (of_irq_parse_one(node, i, &parent)) {
+ dev_err(dev, "failed to parse parent for IDC%d.\n",
+ i);
+ return -EIO;
+ }
+
+ /* Skip IDCs which do not connect to external interrupts */
+ if (parent.args[0] != RV_IRQ_EXT)
+ continue;
+
+ rc = riscv_of_parent_hartid(parent.np, &hartid);
+ if (rc) {
+ dev_err(dev, "failed to parse hart ID for IDC%d.\n",
+ i);
+ return rc;
+ }
+
+ cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0) {
+ dev_warn(dev, "invalid cpuid for IDC%d\n", i);
+ continue;
+ }
+
+ cpumask_set_cpu(cpu, &priv->lmask);
+
+ idc = per_cpu_ptr(&aplic_idcs, cpu);
+ WARN_ON(idc->priv);
+
+ idc->hart_index = i;
+ idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
+ idc->priv = priv;
+
+ aplic_idc_set_delivery(idc, true);
+
+ /*
+ * Boot cpu might not have APLIC hart_index = 0 so check
+ * and update target registers of all interrupts.
+ */
+ if (cpu == smp_processor_id() && idc->hart_index) {
+ val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+ val <<= APLIC_TARGET_HART_IDX_SHIFT;
+ val |= APLIC_DEFAULT_PRIORITY;
+ for (j = 1; j <= priv->nr_irqs; j++)
+ writel(val, priv->regs + APLIC_TARGET_BASE +
+ (j - 1) * sizeof(u32));
+ }
+
+ setup_count++;
+ }
+
+ /* Find parent domain and register chained handler */
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+ DOMAIN_BUS_ANY);
+ if (!aplic_idc_parent_irq && domain) {
+ aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (aplic_idc_parent_irq) {
+ irq_set_chained_handler(aplic_idc_parent_irq,
+ aplic_idc_handle_irq);
+
+ /*
+ * Setup CPUHP notifier to enable IDC parent
+ * interrupt on all CPUs
+ */
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "irqchip/riscv/aplic:starting",
+ aplic_idc_starting_cpu,
+ aplic_idc_dying_cpu);
+ }
+ }
+
+ /* Fail if we were not able to setup IDC for any CPU */
+ return (setup_count) ? 0 : -ENODEV;
+}
+
+static int aplic_probe(struct platform_device *pdev)
+{
+ struct device_node *node = pdev->dev.of_node;
+ struct device *dev = &pdev->dev;
+ struct aplic_priv *priv;
+ struct resource *regs;
+ phys_addr_t pa;
+ int rc;
+
+ regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!regs) {
+ dev_err(dev, "cannot find registers resource\n");
+ return -ENOENT;
+ }
+
+ priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+ platform_set_drvdata(pdev, priv);
+ priv->dev = dev;
+
+ priv->regs = devm_ioremap(dev, regs->start, resource_size(regs));
+ if (WARN_ON(!priv->regs)) {
+ dev_err(dev, "failed ioremap registers\n");
+ return -EIO;
+ }
+
+ of_property_read_u32(node, "riscv,num-sources", &priv->nr_irqs);
+ if (!priv->nr_irqs) {
+ dev_err(dev, "failed to get number of interrupt sources\n");
+ return -EINVAL;
+ }
+
+ /* Setup initial state APLIC interrupts */
+ aplic_init_hw_irqs(priv);
+
+ /*
+ * Setup IDCs or MSIs based on parent interrupts in DT node
+ *
+ * If "msi-parent" DT property is present then we ignore the
+ * APLIC IDCs which forces the APLIC driver to use MSI mode.
+ */
+ priv->nr_idcs = of_property_read_bool(node, "msi-parent") ?
+ 0 : of_irq_count(node);
+ if (priv->nr_idcs)
+ rc = aplic_setup_idc(priv);
+ else
+ rc = aplic_setup_msi(priv);
+ if (rc)
+ return rc;
+
+ /* Setup global config and interrupt delivery */
+ aplic_init_hw_global(priv);
+
+ /* Create irq domain instance for the APLIC */
+ if (priv->nr_idcs)
+ priv->irqdomain = irq_domain_create_linear(
+ of_node_to_fwnode(node),
+ priv->nr_irqs + 1,
+ &aplic_irqdomain_idc_ops,
+ priv);
+ else
+ priv->irqdomain = platform_msi_create_device_domain(dev,
+ priv->nr_irqs + 1,
+ aplic_msi_write_msg,
+ &aplic_irqdomain_msi_ops,
+ priv);
+ if (!priv->irqdomain) {
+ dev_err(dev, "failed to add irq domain\n");
+ return -ENOMEM;
+ }
+
+ /* Advertise the interrupt controller */
+ if (priv->nr_idcs) {
+ dev_info(dev, "%d interrupts directly connected to %d CPUs\n",
+ priv->nr_irqs, priv->nr_idcs);
+ } else {
+ pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
+ dev_info(dev, "%d interrupts forwarded to MSI base %pa\n",
+ priv->nr_irqs, &pa);
+ }
+
+ return 0;
+}
+
+static int aplic_remove(struct platform_device *pdev)
+{
+ struct aplic_priv *priv = platform_get_drvdata(pdev);
+
+ irq_domain_remove(priv->irqdomain);
+
+ return 0;
+}
+
+static const struct of_device_id aplic_match[] = {
+ { .compatible = "riscv,aplic" },
+ {}
+};
+
+static struct platform_driver aplic_driver = {
+ .driver = {
+ .name = "riscv-aplic",
+ .of_match_table = aplic_match,
+ },
+ .probe = aplic_probe,
+ .remove = aplic_remove,
+};
+
+static int __init aplic_init(void)
+{
+ return platform_driver_register(&aplic_driver);
+}
+core_initcall(aplic_init);
diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
new file mode 100644
index 000000000000..88177eefd411
--- /dev/null
+++ b/include/linux/irqchip/riscv-aplic.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
+#define __LINUX_IRQCHIP_RISCV_APLIC_H
+
+#include <linux/bitops.h>
+
+#define APLIC_MAX_IDC BIT(14)
+#define APLIC_MAX_SOURCE 1024
+
+#define APLIC_DOMAINCFG 0x0000
+#define APLIC_DOMAINCFG_RDONLY 0x80000000
+#define APLIC_DOMAINCFG_IE BIT(8)
+#define APLIC_DOMAINCFG_DM BIT(2)
+#define APLIC_DOMAINCFG_BE BIT(0)
+
+#define APLIC_SOURCECFG_BASE 0x0004
+#define APLIC_SOURCECFG_D BIT(10)
+#define APLIC_SOURCECFG_CHILDIDX_MASK 0x000003ff
+#define APLIC_SOURCECFG_SM_MASK 0x00000007
+#define APLIC_SOURCECFG_SM_INACTIVE 0x0
+#define APLIC_SOURCECFG_SM_DETACH 0x1
+#define APLIC_SOURCECFG_SM_EDGE_RISE 0x4
+#define APLIC_SOURCECFG_SM_EDGE_FALL 0x5
+#define APLIC_SOURCECFG_SM_LEVEL_HIGH 0x6
+#define APLIC_SOURCECFG_SM_LEVEL_LOW 0x7
+
+#define APLIC_MMSICFGADDR 0x1bc0
+#define APLIC_MMSICFGADDRH 0x1bc4
+#define APLIC_SMSICFGADDR 0x1bc8
+#define APLIC_SMSICFGADDRH 0x1bcc
+
+#ifdef CONFIG_RISCV_M_MODE
+#define APLIC_xMSICFGADDR APLIC_MMSICFGADDR
+#define APLIC_xMSICFGADDRH APLIC_MMSICFGADDRH
+#else
+#define APLIC_xMSICFGADDR APLIC_SMSICFGADDR
+#define APLIC_xMSICFGADDRH APLIC_SMSICFGADDRH
+#endif
+
+#define APLIC_xMSICFGADDRH_L BIT(31)
+#define APLIC_xMSICFGADDRH_HHXS_MASK 0x1f
+#define APLIC_xMSICFGADDRH_HHXS_SHIFT 24
+#define APLIC_xMSICFGADDRH_LHXS_MASK 0x7
+#define APLIC_xMSICFGADDRH_LHXS_SHIFT 20
+#define APLIC_xMSICFGADDRH_HHXW_MASK 0x7
+#define APLIC_xMSICFGADDRH_HHXW_SHIFT 16
+#define APLIC_xMSICFGADDRH_LHXW_MASK 0xf
+#define APLIC_xMSICFGADDRH_LHXW_SHIFT 12
+#define APLIC_xMSICFGADDRH_BAPPN_MASK 0xfff
+
+#define APLIC_xMSICFGADDR_PPN_SHIFT 12
+
+#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
+ (BIT(__lhxs) - 1)
+
+#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
+ (BIT(__lhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
+ ((__lhxs))
+#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
+ (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
+ APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
+
+#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
+ (BIT(__hhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
+ ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
+#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
+ (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
+ APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
+
+#define APLIC_SETIP_BASE 0x1c00
+#define APLIC_SETIPNUM 0x1cdc
+
+#define APLIC_CLRIP_BASE 0x1d00
+#define APLIC_CLRIPNUM 0x1ddc
+
+#define APLIC_SETIE_BASE 0x1e00
+#define APLIC_SETIENUM 0x1edc
+
+#define APLIC_CLRIE_BASE 0x1f00
+#define APLIC_CLRIENUM 0x1fdc
+
+#define APLIC_SETIPNUM_LE 0x2000
+#define APLIC_SETIPNUM_BE 0x2004
+
+#define APLIC_GENMSI 0x3000
+
+#define APLIC_TARGET_BASE 0x3004
+#define APLIC_TARGET_HART_IDX_SHIFT 18
+#define APLIC_TARGET_HART_IDX_MASK 0x3fff
+#define APLIC_TARGET_GUEST_IDX_SHIFT 12
+#define APLIC_TARGET_GUEST_IDX_MASK 0x3f
+#define APLIC_TARGET_IPRIO_MASK 0xff
+#define APLIC_TARGET_EIID_MASK 0x7ff
+
+#define APLIC_IDC_BASE 0x4000
+#define APLIC_IDC_SIZE 32
+
+#define APLIC_IDC_IDELIVERY 0x00
+
+#define APLIC_IDC_IFORCE 0x04
+
+#define APLIC_IDC_ITHRESHOLD 0x08
+
+#define APLIC_IDC_TOPI 0x18
+#define APLIC_IDC_TOPI_ID_SHIFT 16
+#define APLIC_IDC_TOPI_ID_MASK 0x3ff
+#define APLIC_IDC_TOPI_PRIO_MASK 0xff
+
+#define APLIC_IDC_CLAIMI 0x1c
+
+#endif
--
2.34.1
We add DT bindings document for RISC-V advanced platform level
interrupt controller (APLIC) defined by the RISC-V advanced
interrupt architecture (AIA) specification.
Signed-off-by: Anup Patel <[email protected]>
---
.../interrupt-controller/riscv,aplic.yaml | 159 ++++++++++++++++++
1 file changed, 159 insertions(+)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
new file mode 100644
index 000000000000..b7f20aad72c2
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
@@ -0,0 +1,159 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Advanced Platform Level Interrupt Controller (APLIC)
+
+maintainers:
+ - Anup Patel <[email protected]>
+
+description:
+ The RISC-V advanced interrupt architecture (AIA) defines an advanced
+ platform level interrupt controller (APLIC) for handling wired interrupts
+ in a RISC-V platform. The RISC-V AIA specification can be found at
+ https://github.com/riscv/riscv-aia.
+
+ The RISC-V APLIC is implemented as hierarchical APLIC domains where all
+ interrupt sources connect to the root domain which can further delegate
+ interrupts to child domains. There is one device tree node for each APLIC
+ domain.
+
+allOf:
+ - $ref: /schemas/interrupt-controller.yaml#
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - riscv,qemu-aplic
+ - const: riscv,aplic
+
+ reg:
+ maxItems: 1
+
+ interrupt-controller: true
+
+ "#interrupt-cells":
+ const: 2
+
+ interrupts-extended:
+ minItems: 1
+ maxItems: 16384
+ description:
+ Given APLIC domain directly injects external interrupts to a set of
+ RISC-V HARTS (or CPUs). Each node pointed to should be a riscv,cpu-intc
+ node, which has a riscv node (i.e. RISC-V HART) as parent.
+
+ msi-parent:
+ description:
+ Given APLIC domain forwards wired interrupts as MSIs to a AIA incoming
+ message signaled interrupt controller (IMSIC). This property should be
+ considered only when the interrupts-extended property is absent.
+
+ riscv,num-sources:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ maximum: 1023
+ description:
+ Specifies how many wired interrupts are supported by this APLIC domain.
+
+ riscv,children:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ minItems: 1
+ maxItems: 1024
+ items:
+ maxItems: 1
+ description:
+ A list of child APLIC domains for the given APLIC domain. Each child
+ APLIC domain is assigned child index in increasing order with the
+ first child APLIC domain assigned child index 0. The APLIC domain
+ child index is used by firmware to delegate interrupts from the
+ given APLIC domain to a particular child APLIC domain.
+
+ riscv,delegate:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ minItems: 1
+ maxItems: 1024
+ items:
+ items:
+ - description: child APLIC domain phandle
+ - description: first interrupt number (inclusive)
+ - description: last interrupt number (inclusive)
+ description:
+ A interrupt delegation list where each entry is a triple consisting
+ of child APLIC domain phandle, first interrupt number, and last
+ interrupt number. The firmware will configure interrupt delegation
+ registers based on interrupt delegation list.
+
+required:
+ - compatible
+ - reg
+ - interrupt-controller
+ - "#interrupt-cells"
+ - riscv,num-sources
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ // Example 1 (APLIC domains directly injecting interrupt to HARTs):
+
+ aplic0: interrupt-controller@c000000 {
+ compatible = "riscv,qemu-aplic", "riscv,aplic";
+ interrupts-extended = <&cpu1_intc 11>,
+ <&cpu2_intc 11>,
+ <&cpu3_intc 11>,
+ <&cpu4_intc 11>;
+ reg = <0xc000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ riscv,children = <&aplic1>, <&aplic2>;
+ riscv,delegate = <&aplic1 1 63>;
+ };
+
+ aplic1: interrupt-controller@d000000 {
+ compatible = "riscv,qemu-aplic", "riscv,aplic";
+ interrupts-extended = <&cpu1_intc 9>,
+ <&cpu2_intc 9>;
+ reg = <0xd000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+
+ aplic2: interrupt-controller@e000000 {
+ compatible = "riscv,qemu-aplic", "riscv,aplic";
+ interrupts-extended = <&cpu3_intc 9>,
+ <&cpu4_intc 9>;
+ reg = <0xe000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+
+ - |
+ // Example 2 (APLIC domains forwarding interrupts as MSIs):
+
+ aplic3: interrupt-controller@c000000 {
+ compatible = "riscv,qemu-aplic", "riscv,aplic";
+ msi-parent = <&imsic_mlevel>;
+ reg = <0xc000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ riscv,children = <&aplic4>;
+ riscv,delegate = <&aplic4 1 63>;
+ };
+
+ aplic4: interrupt-controller@d000000 {
+ compatible = "riscv,qemu-aplic", "riscv,aplic";
+ msi-parent = <&imsic_slevel>;
+ reg = <0xd000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+...
--
2.34.1
We have two extension names for AIA ISA support: Smaia (M-mode AIA CSRs)
and Ssaia (S-mode AIA CSRs).
We extend the ISA string parsing to detect Smaia and Ssaia extensions.
Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/include/asm/hwcap.h | 8 ++++++++
arch/riscv/kernel/cpu.c | 2 ++
arch/riscv/kernel/cpufeature.c | 2 ++
3 files changed, 12 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 86328e3acb02..c649e85ed7bb 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -59,10 +59,18 @@ enum riscv_isa_ext_id {
RISCV_ISA_EXT_ZIHINTPAUSE,
RISCV_ISA_EXT_SSTC,
RISCV_ISA_EXT_SVINVAL,
+ RISCV_ISA_EXT_SSAIA,
+ RISCV_ISA_EXT_SMAIA,
RISCV_ISA_EXT_ID_MAX
};
static_assert(RISCV_ISA_EXT_ID_MAX <= RISCV_ISA_EXT_MAX);
+#ifdef CONFIG_RISCV_M_MODE
+#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SMAIA
+#else
+#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SSAIA
+#endif
+
/*
* This enum represents the logical ID for each RISC-V ISA extension static
* keys. We can use static key to optimize code path if some ISA extensions
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 1b9a5a66e55a..a215ec929160 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -162,6 +162,8 @@ arch_initcall(riscv_cpuinfo_init);
* extensions by an underscore.
*/
static struct riscv_isa_ext_data isa_ext_arr[] = {
+ __RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
+ __RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
__RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL),
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 93e45560af30..3c5b51f519d5 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -228,6 +228,8 @@ void __init riscv_fill_hwcap(void)
SET_ISA_EXT_MAP("zihintpause", RISCV_ISA_EXT_ZIHINTPAUSE);
SET_ISA_EXT_MAP("sstc", RISCV_ISA_EXT_SSTC);
SET_ISA_EXT_MAP("svinval", RISCV_ISA_EXT_SVINVAL);
+ SET_ISA_EXT_MAP("smaia", RISCV_ISA_EXT_SMAIA);
+ SET_ISA_EXT_MAP("ssaia", RISCV_ISA_EXT_SSAIA);
}
#undef SET_ISA_EXT_MAP
}
--
2.34.1
Add myself as maintainer for RISC-V AIA drivers including the
RISC-V INTC driver which supports both AIA and non-AIA platforms.
Signed-off-by: Anup Patel <[email protected]>
---
MAINTAINERS | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 7f86d02cb427..c5b8eda0780e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17942,6 +17942,18 @@ F: drivers/perf/riscv_pmu.c
F: drivers/perf/riscv_pmu_legacy.c
F: drivers/perf/riscv_pmu_sbi.c
+RISC-V AIA DRIVERS
+M: Anup Patel <[email protected]>
+L: [email protected]
+S: Maintained
+F: Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
+F: Documentation/devicetree/bindings/interrupt-controller/riscv,imsic.yaml
+F: drivers/irqchip/irq-riscv-aplic.c
+F: drivers/irqchip/irq-riscv-imsic.c
+F: drivers/irqchip/irq-riscv-intc.c
+F: include/linux/irqchip/riscv-aplic.h
+F: include/linux/irqchip/riscv-imsic.h
+
RISC-V ARCHITECTURE
M: Paul Walmsley <[email protected]>
M: Palmer Dabbelt <[email protected]>
--
2.34.1
Hey Anup,
On Tue, Jan 03, 2023 at 07:44:06PM +0530, Anup Patel wrote:
> We add DT bindings document for RISC-V advanced platform level
> interrupt controller (APLIC) defined by the RISC-V advanced
> interrupt architecture (AIA) specification.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> .../interrupt-controller/riscv,aplic.yaml | 159 ++++++++++++++++++
> 1 file changed, 159 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> + interrupts-extended:
> + minItems: 1
> + maxItems: 16384
> + description:
> + Given APLIC domain directly injects external interrupts to a set of
> + RISC-V HARTS (or CPUs). Each node pointed to should be a riscv,cpu-intc
> + node, which has a riscv node (i.e. RISC-V HART) as parent.
> +
> + msi-parent:
> + description:
> + Given APLIC domain forwards wired interrupts as MSIs to a AIA incoming
> + message signaled interrupt controller (IMSIC). This property should be
> + considered only when the interrupts-extended property is absent.
Considered by what?
On v1 you said:
<quote>
If both "interrupts-extended" and "msi-parent" are present then it means
the APLIC domain supports both MSI mode and Direct mode in HW. In this
case, the APLIC driver has to choose between MSI mode or Direct mode.
</quote>
The description is still pretty ambiguous IMO. Perhaps incorporate
some of that expanded comment into the property description?
Say, "If both foo and bar are present, the APLIC domain has hardware
support for both MSI and direct mode. Software may then choose either
mode".
Have I misunderstood your comment on v1? It read as if having both
present indicated that both were possible & that "should be considered
only..." was more of a suggestion and a comment about the Linux driver's
behaviour.
Apologies if I have misunderstood, but I suppose if I have then the
binding's description could be improved!!
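For reference, this is my reading of how the driver in this series picks
the mode. It is a simplified, standalone sketch: the enum and the helper
name are made up for this mail, the real logic lives in aplic_probe(), and
the two arguments stand in for of_property_read_bool(node, "msi-parent")
and of_irq_count(node).

```c
#include <stdbool.h>

/* Hypothetical names, used only to illustrate the selection logic. */
enum aplic_mode {
	APLIC_MODE_DIRECT,	/* IDCs inject interrupts directly into HARTs */
	APLIC_MODE_MSI,		/* wired interrupts are forwarded as MSIs */
};

/*
 * has_msi_parent: whether the "msi-parent" property is present
 * nr_parent_irqs: number of entries in "interrupts-extended"
 */
static inline enum aplic_mode aplic_pick_mode(bool has_msi_parent,
					      int nr_parent_irqs)
{
	/* "msi-parent" wins: the IDCs are ignored even if they are described */
	int nr_idcs = has_msi_parent ? 0 : nr_parent_irqs;

	/* With no usable IDCs the driver goes down the MSI setup path */
	return nr_idcs ? APLIC_MODE_DIRECT : APLIC_MODE_MSI;
}
```

If that reading is right, having both properties present means the Linux
driver uses MSI mode, which looks like driver policy rather than something
the binding itself mandates.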
> + riscv,children:
> + $ref: /schemas/types.yaml#/definitions/phandle-array
> + minItems: 1
> + maxItems: 1024
> + items:
> + maxItems: 1
> + description:
> + A list of child APLIC domains for the given APLIC domain. Each child
> + APLIC domain is assigned child index in increasing order with the
btw, missing article before child (& a comma after order I think).
> + first child APLIC domain assigned child index 0. The APLIC domain
> + child index is used by firmware to delegate interrupts from the
> + given APLIC domain to a particular child APLIC domain.
> +
> + riscv,delegate:
> + $ref: /schemas/types.yaml#/definitions/phandle-array
> + minItems: 1
> + maxItems: 1024
Is it valid to have a delegate property without children? If not, the
binding should reflect that dependency IMO.
> + items:
> + items:
> + - description: child APLIC domain phandle
> + - description: first interrupt number (inclusive)
> + - description: last interrupt number (inclusive)
> + description:
> + A interrupt delegation list where each entry is a triple consisting
> + of child APLIC domain phandle, first interrupt number, and last
> + interrupt number. The firmware will configure interrupt delegation
btw, drop the article before firmware here.
Also, "firmware will" or "firmware must"? Semantics perhaps, but they
are different!
Kinda for my own curiosity here, but do you expect these properties to
generally be dynamically filled in by the bootloader or read by the
bootloader to set up the configuration?
> + registers based on interrupt delegation list.
I'm sorry Anup, but this child versus delegate thing is still not clear
to me binding wise. See below.
> + aplic0: interrupt-controller@c000000 {
> + compatible = "riscv,qemu-aplic", "riscv,aplic";
> + interrupts-extended = <&cpu1_intc 11>,
> + <&cpu2_intc 11>,
> + <&cpu3_intc 11>,
> + <&cpu4_intc 11>;
> + reg = <0xc000000 0x4080>;
> + interrupt-controller;
> + #interrupt-cells = <2>;
> + riscv,num-sources = <63>;
> + riscv,children = <&aplic1>, <&aplic2>;
> + riscv,delegate = <&aplic1 1 63>;
Is aplic2 here for demonstrative purposes only, since it has not been
delegated any interrupts?
I suppose it is hardware present on the SoC that is not being used by
the current configuration?
Thanks,
Conor.
> + };
> +
> + aplic1: interrupt-controller@d000000 {
> + compatible = "riscv,qemu-aplic", "riscv,aplic";
> + interrupts-extended = <&cpu1_intc 9>,
> + <&cpu2_intc 9>;
> + reg = <0xd000000 0x4080>;
> + interrupt-controller;
> + #interrupt-cells = <2>;
> + riscv,num-sources = <63>;
> + };
> +
> + aplic2: interrupt-controller@e000000 {
> + compatible = "riscv,qemu-aplic", "riscv,aplic";
> + interrupts-extended = <&cpu3_intc 9>,
> + <&cpu4_intc 9>;
> + reg = <0xe000000 0x4080>;
> + interrupt-controller;
> + #interrupt-cells = <2>;
> + riscv,num-sources = <63>;
> + };
Hey Anup!
On Tue, Jan 03, 2023 at 07:44:01PM +0530, Anup Patel wrote:
> The RISC-V AIA specification improves handling per-HART local interrupts
> in a backward compatible manner. This patch adds defines for new RISC-V
> AIA CSRs.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> arch/riscv/include/asm/csr.h | 92 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 92 insertions(+)
>
> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> index 0e571f6483d9..4e1356bad7b2 100644
> --- a/arch/riscv/include/asm/csr.h
> +++ b/arch/riscv/include/asm/csr.h
> @@ -73,7 +73,10 @@
> #define IRQ_S_EXT 9
> #define IRQ_VS_EXT 10
> #define IRQ_M_EXT 11
> +#define IRQ_S_GEXT 12
> #define IRQ_PMU_OVF 13
> +#define IRQ_LOCAL_MAX (IRQ_PMU_OVF + 1)
> +#define IRQ_LOCAL_MASK ((_AC(1, UL) << IRQ_LOCAL_MAX) - 1)
>
> /* Exception causes */
> #define EXC_INST_MISALIGNED 0
> @@ -156,6 +159,26 @@
> (_AC(1, UL) << IRQ_S_TIMER) | \
> (_AC(1, UL) << IRQ_S_EXT))
>
> +/* AIA CSR bits */
> +#define TOPI_IID_SHIFT 16
> +#define TOPI_IID_MASK 0xfff
> +#define TOPI_IPRIO_MASK 0xff
> +#define TOPI_IPRIO_BITS 8
> +
> +#define TOPEI_ID_SHIFT 16
> +#define TOPEI_ID_MASK 0x7ff
> +#define TOPEI_PRIO_MASK 0x7ff
> +
> +#define ISELECT_IPRIO0 0x30
> +#define ISELECT_IPRIO15 0x3f
> +#define ISELECT_MASK 0x1ff
> +
> +#define HVICTL_VTI 0x40000000
> +#define HVICTL_IID 0x0fff0000
> +#define HVICTL_IID_SHIFT 16
> +#define HVICTL_IPRIOM 0x00000100
> +#define HVICTL_IPRIO 0x000000ff
Why not name these as masks, like you did for the other masks?
Also, the mask/shift defines appear inconsistent. TOPI_IID_MASK is
intended to be used post-shift AFAICT, but HVICTL_IID_SHIFT is intended
to be used *pre*-shift.
Some consistency in naming and function would be great.
> +/* Machine-Level High-Half CSRs (AIA) */
> +#define CSR_MIDELEGH 0x313
I feel like I could find Midelegh in an Irish dictionary lol
Anyways, I went through the CSRs and they do all seem correct.
Thanks,
Conor.
On Thu, Jan 5, 2023 at 4:37 AM Conor Dooley <[email protected]> wrote:
>
> Hey Anup!
>
> On Tue, Jan 03, 2023 at 07:44:01PM +0530, Anup Patel wrote:
> > The RISC-V AIA specification improves handling per-HART local interrupts
> > in a backward compatible manner. This patch adds defines for new RISC-V
> > AIA CSRs.
> >
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > arch/riscv/include/asm/csr.h | 92 ++++++++++++++++++++++++++++++++++++
> > 1 file changed, 92 insertions(+)
> >
> > diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> > index 0e571f6483d9..4e1356bad7b2 100644
> > --- a/arch/riscv/include/asm/csr.h
> > +++ b/arch/riscv/include/asm/csr.h
> > @@ -73,7 +73,10 @@
> > #define IRQ_S_EXT 9
> > #define IRQ_VS_EXT 10
> > #define IRQ_M_EXT 11
> > +#define IRQ_S_GEXT 12
> > #define IRQ_PMU_OVF 13
> > +#define IRQ_LOCAL_MAX (IRQ_PMU_OVF + 1)
> > +#define IRQ_LOCAL_MASK ((_AC(1, UL) << IRQ_LOCAL_MAX) - 1)
> >
> > /* Exception causes */
> > #define EXC_INST_MISALIGNED 0
> > @@ -156,6 +159,26 @@
> > (_AC(1, UL) << IRQ_S_TIMER) | \
> > (_AC(1, UL) << IRQ_S_EXT))
> >
> > +/* AIA CSR bits */
> > +#define TOPI_IID_SHIFT 16
> > +#define TOPI_IID_MASK 0xfff
> > +#define TOPI_IPRIO_MASK 0xff
> > +#define TOPI_IPRIO_BITS 8
> > +
> > +#define TOPEI_ID_SHIFT 16
> > +#define TOPEI_ID_MASK 0x7ff
> > +#define TOPEI_PRIO_MASK 0x7ff
> > +
> > +#define ISELECT_IPRIO0 0x30
> > +#define ISELECT_IPRIO15 0x3f
> > +#define ISELECT_MASK 0x1ff
> > +
> > +#define HVICTL_VTI 0x40000000
> > +#define HVICTL_IID 0x0fff0000
> > +#define HVICTL_IID_SHIFT 16
> > +#define HVICTL_IPRIOM 0x00000100
> > +#define HVICTL_IPRIO 0x000000ff
>
> Why not name these as masks, like you did for the other masks?
> Also, the mask/shift defines appear inconsistent. TOPI_IID_MASK is
> intended to be used post-shift AFAICT, but HVICTL_IID_SHIFT is intended
> to be used *pre*-shift.
> Some consistency in naming and function would be great.
The following convention is being followed in asm/csr.h for defining
MASK of any XYZ field in ABC CSR:
1. ABC_XYZ : This name is used for MASK which is intended
to be used before SHIFT
2. ABC_XYZ_MASK: This name is used for MASK which is
intended to be used after SHIFT
The existing defines for [M|S]STATUS, HSTATUS, SATP, and xENVCFG
follow the above convention. The only outlier is the HGATPx_VMID_MASK
define, which I will fix in my next KVM RISC-V series.
I don't see how any of the AIA CSR defines are violating the above
convention.
The choice of ABC_XYZ versus ABC_XYZ_MASK name is up to
the developer as long as the above convention is not violated.
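To illustrate the two styles side by side, here is a small standalone
sketch. The define values are the ones from this patch; the two helper
functions are hypothetical and exist only for this example, they are not
part of asm/csr.h.

```c
/* Style 1 (ABC_XYZ): in-place mask, applied before the shift */
#define HVICTL_IID		0x0fff0000UL
#define HVICTL_IID_SHIFT	16

/* Style 2 (ABC_XYZ_MASK): right-aligned mask, applied after the shift */
#define TOPI_IID_SHIFT		16
#define TOPI_IID_MASK		0xfffUL

/* Hypothetical helper: extract IID from an hvictl value (mask, then shift) */
static inline unsigned long hvictl_get_iid(unsigned long hvictl)
{
	return (hvictl & HVICTL_IID) >> HVICTL_IID_SHIFT;
}

/* Hypothetical helper: extract IID from a topi value (shift, then mask) */
static inline unsigned long topi_get_iid(unsigned long topi)
{
	return (topi >> TOPI_IID_SHIFT) & TOPI_IID_MASK;
}
```

Both helpers return the same field value; only the order of the mask and
shift operations differs, and the name of the define tells the reader
which order is expected.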
>
>
> > +/* Machine-Level High-Half CSRs (AIA) */
> > +#define CSR_MIDELEGH 0x313
>
> I feel like I could find Midelegh in an Irish dictionary lol
> Anyways, I went through the CSRs and they do all seem correct.
>
> Thanks,
> Conor.
>
>
Regards,
Anup
On Tue, Jan 03, 2023 at 07:44:06PM +0530, Anup Patel wrote:
> We add DT bindings document for RISC-V advanced platform level
> interrupt controller (APLIC) defined by the RISC-V advanced
> interrupt architecture (AIA) specification.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> .../interrupt-controller/riscv,aplic.yaml | 159 ++++++++++++++++++
> 1 file changed, 159 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
>
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> new file mode 100644
> index 000000000000..b7f20aad72c2
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> @@ -0,0 +1,159 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: RISC-V Advanced Platform Level Interrupt Controller (APLIC)
> +
> +maintainers:
> + - Anup Patel <[email protected]>
> +
> +description:
> + The RISC-V advanced interrupt architecture (AIA) defines an advanced
> + platform level interrupt controller (APLIC) for handling wired interrupts
> + in a RISC-V platform. The RISC-V AIA specification can be found at
> + https://github.com/riscv/riscv-aia.
> +
> + The RISC-V APLIC is implemented as hierarchical APLIC domains where all
> + interrupt sources connect to the root domain which can further delegate
> + interrupts to child domains. There is one device tree node for each APLIC
> + domain.
> +
> +allOf:
> + - $ref: /schemas/interrupt-controller.yaml#
> +
> +properties:
> + compatible:
> + items:
> + - enum:
> + - riscv,qemu-aplic
Make 'qemu' the vendor.
> + - const: riscv,aplic
> +
> + reg:
> + maxItems: 1
> +
> + interrupt-controller: true
> +
> + "#interrupt-cells":
> + const: 2
> +
> + interrupts-extended:
> + minItems: 1
> + maxItems: 16384
> + description:
> + Given APLIC domain directly injects external interrupts to a set of
> + RISC-V HARTS (or CPUs). Each node pointed to should be a riscv,cpu-intc
> + node, which has a riscv node (i.e. RISC-V HART) as parent.
> +
> + msi-parent:
> + description:
> + Given APLIC domain forwards wired interrupts as MSIs to a AIA incoming
> + message signaled interrupt controller (IMSIC). This property should be
> + considered only when the interrupts-extended property is absent.
> +
> + riscv,num-sources:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + minimum: 1
> + maximum: 1023
> + description:
> + Specifies how many wired interrupts are supported by this APLIC domain.
We don't normally need to know how many interrupts, why here?
> +
> + riscv,children:
> + $ref: /schemas/types.yaml#/definitions/phandle-array
> + minItems: 1
> + maxItems: 1024
> + items:
> + maxItems: 1
> + description:
> + A list of child APLIC domains for the given APLIC domain. Each child
> + APLIC domain is assigned child index in increasing order with the
> + first child APLIC domain assigned child index 0. The APLIC domain
> + child index is used by firmware to delegate interrupts from the
> + given APLIC domain to a particular child APLIC domain.
> +
> + riscv,delegate:
> + $ref: /schemas/types.yaml#/definitions/phandle-array
> + minItems: 1
> + maxItems: 1024
> + items:
> + items:
> + - description: child APLIC domain phandle
> + - description: first interrupt number (inclusive)
> + - description: last interrupt number (inclusive)
> + description:
> + A interrupt delegation list where each entry is a triple consisting
> + of child APLIC domain phandle, first interrupt number, and last
> + interrupt number. The firmware will configure interrupt delegation
> + registers based on interrupt delegation list.
Is the node's domain delegating its interrupts to the child domain, or
the other way around? Are the interrupt numbers here this domain's or
the child's?
> +
> +required:
> + - compatible
> + - reg
> + - interrupt-controller
> + - "#interrupt-cells"
> + - riscv,num-sources
> +
> +unevaluatedProperties: false
> +
> +examples:
> + - |
> + // Example 1 (APLIC domains directly injecting interrupt to HARTs):
> +
> + aplic0: interrupt-controller@c000000 {
> + compatible = "riscv,qemu-aplic", "riscv,aplic";
> + interrupts-extended = <&cpu1_intc 11>,
> + <&cpu2_intc 11>,
> + <&cpu3_intc 11>,
> + <&cpu4_intc 11>;
> + reg = <0xc000000 0x4080>;
> + interrupt-controller;
> + #interrupt-cells = <2>;
> + riscv,num-sources = <63>;
> + riscv,children = <&aplic1>, <&aplic2>;
> + riscv,delegate = <&aplic1 1 63>;
> + };
> +
> + aplic1: interrupt-controller@d000000 {
> + compatible = "riscv,qemu-aplic", "riscv,aplic";
> + interrupts-extended = <&cpu1_intc 9>,
> + <&cpu2_intc 9>;
> + reg = <0xd000000 0x4080>;
> + interrupt-controller;
> + #interrupt-cells = <2>;
> + riscv,num-sources = <63>;
> + };
> +
> + aplic2: interrupt-controller@e000000 {
> + compatible = "riscv,qemu-aplic", "riscv,aplic";
> + interrupts-extended = <&cpu3_intc 9>,
> + <&cpu4_intc 9>;
> + reg = <0xe000000 0x4080>;
> + interrupt-controller;
> + #interrupt-cells = <2>;
> + riscv,num-sources = <63>;
> + };
> +
> + - |
> + // Example 2 (APLIC domains forwarding interrupts as MSIs):
> +
> + aplic3: interrupt-controller@c000000 {
> + compatible = "riscv,qemu-aplic", "riscv,aplic";
> + msi-parent = <&imsic_mlevel>;
> + reg = <0xc000000 0x4000>;
> + interrupt-controller;
> + #interrupt-cells = <2>;
> + riscv,num-sources = <63>;
> + riscv,children = <&aplic4>;
> + riscv,delegate = <&aplic4 1 63>;
> + };
> +
> + aplic4: interrupt-controller@d000000 {
> + compatible = "riscv,qemu-aplic", "riscv,aplic";
> + msi-parent = <&imsic_slevel>;
> + reg = <0xd000000 0x4000>;
> + interrupt-controller;
> + #interrupt-cells = <2>;
> + riscv,num-sources = <63>;
> + };
> +...
> --
> 2.34.1
>
On Tue, 03 Jan 2023 14:14:03 +0000,
Anup Patel <[email protected]> wrote:
>
> The RISC-V advanced interrupt architecture (AIA) extends the per-HART
> local interrupts in following ways:
> 1. Minimum 64 local interrupts for both RV32 and RV64
> 2. Ability to process multiple pending local interrupts in same
> interrupt handler
> 3. Priority configuration for each local interrupts
> 4. Special CSRs to configure/access the per-HART MSI controller
>
> This patch adds support for RISC-V AIA in the RISC-V intc driver.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> drivers/irqchip/irq-riscv-intc.c | 37 ++++++++++++++++++++++++++------
> 1 file changed, 31 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
> index f229e3e66387..880d1639aadc 100644
> --- a/drivers/irqchip/irq-riscv-intc.c
> +++ b/drivers/irqchip/irq-riscv-intc.c
> @@ -16,6 +16,7 @@
> #include <linux/module.h>
> #include <linux/of.h>
> #include <linux/smp.h>
> +#include <asm/hwcap.h>
>
> static struct irq_domain *intc_domain;
>
> @@ -29,6 +30,15 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
> generic_handle_domain_irq(intc_domain, cause);
> }
>
> +static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)
What does "static asmlinkage" in a C file even mean? And clearly, this
isn't the only instance in this file...
> +{
> + unsigned long topi;
> +
> + while ((topi = csr_read(CSR_TOPI)))
> + generic_handle_domain_irq(intc_domain,
> + topi >> TOPI_IID_SHIFT);
> +}
> +
> /*
> * On RISC-V systems local interrupts are masked or unmasked by writing
> * the SIE (Supervisor Interrupt Enable) CSR. As CSRs can only be written
> @@ -38,12 +48,18 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
>
> static void riscv_intc_irq_mask(struct irq_data *d)
> {
> - csr_clear(CSR_IE, BIT(d->hwirq));
> + if (d->hwirq < BITS_PER_LONG)
And what if BITS_PER_LONG is 32, as I expect it to be on 32-bit, which
the commit message says is supported?
> + csr_clear(CSR_IE, BIT(d->hwirq));
> + else
> + csr_clear(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
> }
>
> static void riscv_intc_irq_unmask(struct irq_data *d)
> {
> - csr_set(CSR_IE, BIT(d->hwirq));
> + if (d->hwirq < BITS_PER_LONG)
> + csr_set(CSR_IE, BIT(d->hwirq));
> + else
> + csr_set(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
> }
>
> static void riscv_intc_irq_eoi(struct irq_data *d)
> @@ -115,7 +131,7 @@ static struct fwnode_handle *riscv_intc_hwnode(void)
> static int __init riscv_intc_init(struct device_node *node,
> struct device_node *parent)
> {
> - int rc;
> + int rc, nr_irqs;
> unsigned long hartid;
>
> rc = riscv_of_parent_hartid(node, &hartid);
> @@ -133,14 +149,21 @@ static int __init riscv_intc_init(struct device_node *node,
> if (riscv_hartid_to_cpuid(hartid) != smp_processor_id())
> return 0;
>
> - intc_domain = irq_domain_add_linear(node, BITS_PER_LONG,
> + nr_irqs = BITS_PER_LONG;
> + if (riscv_isa_extension_available(NULL, SxAIA) && BITS_PER_LONG == 32)
> + nr_irqs = nr_irqs * 2;
Really, please drop this BITS_PER_LONG stuff. Use explicit numbers.
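One way to read "explicit numbers", as an untested user-space sketch rather
than a proposed patch: RISCV_INTC_NR_LOCAL_IRQS and intc_ie_locate() are
hypothetical names, and the value 64 is taken from the commit message's
"minimum 64 local interrupts for both RV32 and RV64".

```c
#include <stdbool.h>

/* From the commit message: at least 64 local interrupts on RV32 and RV64. */
#define RISCV_INTC_NR_LOCAL_IRQS	64

/* Hypothetical result type: which enable CSR to touch, and which bit in it. */
struct intc_ie_loc {
	bool high;		/* true: CSR_IEH (RV32 only), false: CSR_IE */
	unsigned int bit;	/* bit position inside the selected CSR */
};

/*
 * Map a local interrupt number to its enable bit. xlen is passed in
 * explicitly (32 or 64) so the 32-bit split is visible in the code
 * instead of being hidden behind BITS_PER_LONG.
 */
static struct intc_ie_loc intc_ie_locate(unsigned int hwirq, unsigned int xlen)
{
	struct intc_ie_loc loc = { .high = false, .bit = hwirq };

	if (xlen == 32 && hwirq >= 32) {
		loc.high = true;
		loc.bit = hwirq - 32;
	}
	return loc;
}
```

With this shape the domain size is simply RISCV_INTC_NR_LOCAL_IRQS when AIA
is available, regardless of the word size.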
M.
--
Without deviation from the norm, progress is not possible.
On Tue, 03 Jan 2023 14:14:05 +0000,
Anup Patel <[email protected]> wrote:
>
> The RISC-V advanced interrupt architecture (AIA) specification defines
> a new MSI controller for managing MSIs on a RISC-V platform. This new
> MSI controller is referred to as incoming message signaled interrupt
> controller (IMSIC) which manages MSI on per-HART (or per-CPU) basis.
> (For more details refer https://github.com/riscv/riscv-aia)
And how about IPIs, which this driver seems to be concerned about?
>
> This patch adds an irqchip driver for RISC-V IMSIC found on RISC-V
> platforms.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> drivers/irqchip/Kconfig | 14 +-
> drivers/irqchip/Makefile | 1 +
> drivers/irqchip/irq-riscv-imsic.c | 1174 +++++++++++++++++++++++++++
> include/linux/irqchip/riscv-imsic.h | 92 +++
> 4 files changed, 1280 insertions(+), 1 deletion(-)
> create mode 100644 drivers/irqchip/irq-riscv-imsic.c
> create mode 100644 include/linux/irqchip/riscv-imsic.h
>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index 9e65345ca3f6..a1315189a595 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -29,7 +29,6 @@ config ARM_GIC_V2M
>
> config GIC_NON_BANKED
> bool
> -
> config ARM_GIC_V3
> bool
> select IRQ_DOMAIN_HIERARCHY
> @@ -548,6 +547,19 @@ config SIFIVE_PLIC
> select IRQ_DOMAIN_HIERARCHY
> select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>
> +config RISCV_IMSIC
> + bool
> + depends on RISCV
> + select IRQ_DOMAIN_HIERARCHY
> + select GENERIC_MSI_IRQ_DOMAIN
> +
> +config RISCV_IMSIC_PCI
> + bool
> + depends on RISCV_IMSIC
> + depends on PCI
> + depends on PCI_MSI
> + default RISCV_IMSIC
This should definitely tell you that this driver needs splitting.
> +
> config EXYNOS_IRQ_COMBINER
> bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 87b49a10962c..22c723cc6ec8 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
> obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
> diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> new file mode 100644
> index 000000000000..4c16b66738d6
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-imsic.c
> @@ -0,0 +1,1174 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-imsic: " fmt
> +#include <linux/bitmap.h>
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/irqchip/riscv-imsic.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> +#include <linux/msi.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_irq.h>
> +#include <linux/pci.h>
> +#include <linux/platform_device.h>
> +#include <linux/spinlock.h>
> +#include <linux/smp.h>
> +#include <asm/hwcap.h>
> +
> +#define IMSIC_DISABLE_EIDELIVERY 0
> +#define IMSIC_ENABLE_EIDELIVERY 1
> +#define IMSIC_DISABLE_EITHRESHOLD 1
> +#define IMSIC_ENABLE_EITHRESHOLD 0
> +
> +#define imsic_csr_write(__c, __v) \
> +do { \
> + csr_write(CSR_ISELECT, __c); \
> + csr_write(CSR_IREG, __v); \
> +} while (0)
> +
> +#define imsic_csr_read(__c) \
> +({ \
> + unsigned long __v; \
> + csr_write(CSR_ISELECT, __c); \
> + __v = csr_read(CSR_IREG); \
> + __v; \
> +})
> +
> +#define imsic_csr_set(__c, __v) \
> +do { \
> + csr_write(CSR_ISELECT, __c); \
> + csr_set(CSR_IREG, __v); \
> +} while (0)
> +
> +#define imsic_csr_clear(__c, __v) \
> +do { \
> + csr_write(CSR_ISELECT, __c); \
> + csr_clear(CSR_IREG, __v); \
> +} while (0)
> +
> +struct imsic_mmio {
> + phys_addr_t pa;
> + void __iomem *va;
> + unsigned long size;
> +};
> +
> +struct imsic_priv {
> + /* Global configuration common for all HARTs */
> + struct imsic_global_config global;
> +
> + /* MMIO regions */
> + u32 num_mmios;
> + struct imsic_mmio *mmios;
> +
> + /* Global state of interrupt identities */
> + raw_spinlock_t ids_lock;
> + unsigned long *ids_used_bimap;
> + unsigned long *ids_enabled_bimap;
> + unsigned int *ids_target_cpu;
> +
> + /* Mask for connected CPUs */
> + struct cpumask lmask;
> +
> + /* IPI interrupt identity */
> + u32 ipi_id;
> + u32 ipi_lsync_id;
> +
> + /* IRQ domains */
> + struct irq_domain *base_domain;
> + struct irq_domain *pci_domain;
> + struct irq_domain *plat_domain;
> +};
> +
> +struct imsic_handler {
> + /* Local configuration for given HART */
> + struct imsic_local_config local;
> +
> + /* Pointer to private context */
> + struct imsic_priv *priv;
> +};
> +
> +static bool imsic_init_done;
> +
> +static int imsic_parent_irq;
> +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> +
> +const struct imsic_global_config *imsic_get_global_config(void)
> +{
> + struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> +
> + if (!handler || !handler->priv)
> + return NULL;
> +
> + return &handler->priv->global;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> +
> +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> +{
> + struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +
> + if (!handler || !handler->priv)
> + return NULL;
How can this happen?
> +
> + return &handler->local;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_local_config);
Why are these symbols exported? They have no user, so they shouldn't
even exist here. I also seriously doubt there is a valid use case for
exposing this information to the rest of the kernel.
> +
> +static int imsic_cpu_page_phys(unsigned int cpu,
> + unsigned int guest_index,
> + phys_addr_t *out_msi_pa)
> +{
> + struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> + struct imsic_global_config *global;
> + struct imsic_local_config *local;
> +
> + if (!handler || !handler->priv)
> + return -ENODEV;
> + local = &handler->local;
> + global = &handler->priv->global;
> +
> + if (BIT(global->guest_index_bits) <= guest_index)
> + return -EINVAL;
> +
> + if (out_msi_pa)
> + *out_msi_pa = local->msi_pa +
> + (guest_index * IMSIC_MMIO_PAGE_SZ);
> +
> + return 0;
> +}
> +
> +static int imsic_get_cpu(struct imsic_priv *priv,
> + const struct cpumask *mask_val, bool force,
> + unsigned int *out_target_cpu)
> +{
> + struct cpumask amask;
> + unsigned int cpu;
> +
> + cpumask_and(&amask, &priv->lmask, mask_val);
> +
> + if (force)
> + cpu = cpumask_first(&amask);
> + else
> + cpu = cpumask_any_and(&amask, cpu_online_mask);
> +
> + if (cpu >= nr_cpu_ids)
> + return -EINVAL;
> +
> + if (out_target_cpu)
> + *out_target_cpu = cpu;
> +
> + return 0;
> +}
> +
> +static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
> + struct msi_msg *msg)
> +{
> + phys_addr_t msi_addr;
> + int err;
> +
> + err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> + if (err)
> + return err;
> +
> + msg->address_hi = upper_32_bits(msi_addr);
> + msg->address_lo = lower_32_bits(msi_addr);
> + msg->data = id;
> +
> + return err;
> +}
> +
> +static void imsic_id_set_target(struct imsic_priv *priv,
> + unsigned int id, unsigned int target_cpu)
> +{
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> + priv->ids_target_cpu[id] = target_cpu;
> + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static unsigned int imsic_id_get_target(struct imsic_priv *priv,
> + unsigned int id)
> +{
> + unsigned int ret;
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> + ret = priv->ids_target_cpu[id];
> + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> + return ret;
> +}
> +
> +static void __imsic_eix_update(unsigned long base_id,
> + unsigned long num_id, bool pend, bool val)
> +{
> + unsigned long i, isel, ireg, flags;
> + unsigned long id = base_id, last_id = base_id + num_id;
> +
> + while (id < last_id) {
> + isel = id / BITS_PER_LONG;
> + isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> + isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> +
> + ireg = 0;
> + for (i = id & (__riscv_xlen - 1);
> + (id < last_id) && (i < __riscv_xlen); i++) {
> + ireg |= BIT(i);
> + id++;
> + }
> +
> + /*
> + * The IMSIC EIEx and EIPx registers are indirectly
> + * accessed via using ISELECT and IREG CSRs so we
> + * save/restore local IRQ to ensure that we don't
> + * get preempted while accessing IMSIC registers.
> + */
> + local_irq_save(flags);
> + if (val)
> + imsic_csr_set(isel, ireg);
> + else
> + imsic_csr_clear(isel, ireg);
> + local_irq_restore(flags);
What is the actual requirement? no preemption? or no interrupts? This
isn't the same thing. Also, a bunch of the users already disable
interrupts. Consistency wouldn't hurt here.
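To make the distinction concrete, here is a small user-space model (not
kernel code; every model_* name is made up for illustration). It assumes
that an interrupt handler on the same hart also goes through the
ISELECT/IREG pair, which imsic_handle_irq() appears to do via
imsic_ids_local_sync(). Under that assumption, disabling preemption alone
would not be enough:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ISELECT/IREG CSR pair of one hart. */
static unsigned long iselect;
static unsigned long regs[4];

static void model_csr_write_iselect(unsigned long sel)
{
	iselect = sel;
}

static void model_csr_set_ireg(unsigned long bits)
{
	regs[iselect] |= bits;
}

/* Models an interrupt handler that itself uses ISELECT/IREG. */
static void model_irq_handler(void)
{
	model_csr_write_iselect(3);
	model_csr_set_ireg(0x1);
}

/*
 * Select-then-access sequence. When irq_in_window is true, the handler
 * runs between the two steps, as it could with interrupts enabled.
 */
static void model_eix_set(unsigned long sel, unsigned long bits,
			  bool irq_in_window)
{
	model_csr_write_iselect(sel);
	if (irq_in_window)
		model_irq_handler();
	model_csr_set_ireg(bits);
}
```

In the model, an interrupt landing inside the window leaves ISELECT
pointing at the handler's register, so the bits end up in the wrong place.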
> + }
> +}
> +
> +#define __imsic_id_enable(__id) \
> + __imsic_eix_update((__id), 1, false, true)
> +#define __imsic_id_disable(__id) \
> + __imsic_eix_update((__id), 1, false, false)
> +
> +#ifdef CONFIG_SMP
> +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> +{
> + struct imsic_handler *handler;
> + struct cpumask amask;
> + int cpu;
> +
> + cpumask_and(&amask, &priv->lmask, cpu_online_mask);
Can't this race against a CPU going down?
> + for_each_cpu(cpu, &amask) {
> + if (cpu == smp_processor_id())
> + continue;
> +
> + handler = per_cpu_ptr(&imsic_handlers, cpu);
> + if (!handler || !handler->priv || !handler->local.msi_va) {
> + pr_warn("CPU%d: handler not initialized\n", cpu);
How many times are you going to do that? On each failing synchronisation?
> + continue;
> + }
> +
> + writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
As I understand it, this is a "behind the scenes" IPI. Why isn't that
a *real* IPI?
> + }
> +}
> +#else
> +#define __imsic_id_smp_sync(__priv)
> +#endif
> +
> +static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
> +{
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> + bitmap_set(priv->ids_enabled_bimap, id, 1);
> + __imsic_id_enable(id);
> + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> + __imsic_id_smp_sync(priv);
> +}
> +
> +static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
> +{
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> + bitmap_clear(priv->ids_enabled_bimap, id, 1);
> + __imsic_id_disable(id);
> + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> + __imsic_id_smp_sync(priv);
> +}
> +
> +static void imsic_ids_local_sync(struct imsic_priv *priv)
> +{
> + int i;
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> + for (i = 1; i <= priv->global.nr_ids; i++) {
> + if (priv->ipi_id == i || priv->ipi_lsync_id == i)
> + continue;
> +
> + if (test_bit(i, priv->ids_enabled_bimap))
> + __imsic_id_enable(i);
> + else
> + __imsic_id_disable(i);
> + }
> + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
> +{
> + if (enable) {
> + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> + } else {
> + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> + }
> +}
> +
> +static int imsic_ids_alloc(struct imsic_priv *priv,
> + unsigned int max_id, unsigned int order)
> +{
> + int ret;
> + unsigned long flags;
> +
> + if ((priv->global.nr_ids < max_id) ||
> + (max_id < BIT(order)))
> + return -EINVAL;
Why do we need this check? Shouldn't that be guaranteed by
construction?
> +
> + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> + ret = bitmap_find_free_region(priv->ids_used_bimap,
> + max_id + 1, order);
> + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +
> + return ret;
> +}
> +
> +static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
> + unsigned int order)
> +{
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> + bitmap_release_region(priv->ids_used_bimap, base_id, order);
> + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> +}
> +
> +static int __init imsic_ids_init(struct imsic_priv *priv)
> +{
> + int i;
> + struct imsic_global_config *global = &priv->global;
> +
> + raw_spin_lock_init(&priv->ids_lock);
> +
> + /* Allocate used bitmap */
> + priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> + sizeof(unsigned long), GFP_KERNEL);
How about bitmap_alloc?
> + if (!priv->ids_used_bimap)
> + return -ENOMEM;
> +
> + /* Allocate enabled bitmap */
> + priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> + sizeof(unsigned long), GFP_KERNEL);
> + if (!priv->ids_enabled_bimap) {
> + kfree(priv->ids_used_bimap);
> + return -ENOMEM;
> + }
> +
> + /* Allocate target CPU array */
> + priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
> + sizeof(unsigned int), GFP_KERNEL);
> + if (!priv->ids_target_cpu) {
> + kfree(priv->ids_enabled_bimap);
> + kfree(priv->ids_used_bimap);
> + return -ENOMEM;
> + }
> + for (i = 0; i <= global->nr_ids; i++)
> + priv->ids_target_cpu[i] = UINT_MAX;
> +
> + /* Reserve ID#0 because it is special and never implemented */
> + bitmap_set(priv->ids_used_bimap, 0, 1);
> +
> + return 0;
> +}
> +
> +static void __init imsic_ids_cleanup(struct imsic_priv *priv)
> +{
> + kfree(priv->ids_target_cpu);
> + kfree(priv->ids_enabled_bimap);
> + kfree(priv->ids_used_bimap);
> +}
> +
> +#ifdef CONFIG_SMP
> +static void imsic_ipi_send(unsigned int cpu)
> +{
> + struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> +
> + if (!handler || !handler->priv || !handler->local.msi_va) {
> + pr_warn("CPU%d: handler not initialized\n", cpu);
> + return;
> + }
> +
> + writel(handler->priv->ipi_id, handler->local.msi_va);
> +}
> +
> +static void imsic_ipi_enable(struct imsic_priv *priv)
> +{
> + __imsic_id_enable(priv->ipi_id);
> + __imsic_id_enable(priv->ipi_lsync_id);
> +}
> +
> +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> +{
> + int virq;
> +
> + /* Allocate interrupt identity for IPIs */
> + virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> + if (virq < 0)
> + return virq;
> + priv->ipi_id = virq;
> +
> + /* Create IMSIC IPI multiplexing */
> + virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);
Please! This BITS_PER_BYTE makes zero sense here. Have a proper define
that says 8, and document *why* this is 8! You're not defining a type
system, you're writing an irqchip driver.
> + if (virq <= 0) {
> + imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> + return (virq < 0) ? virq : -ENOMEM;
> + }
> +
> + /* Set vIRQ range */
> + riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
> +
> + /* Allocate interrupt identity for local enable/disable sync */
> + virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> + if (virq < 0) {
> + imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> + return virq;
> + }
> + priv->ipi_lsync_id = virq;
> +
> + return 0;
> +}
> +
> +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> +{
> + imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
> + if (priv->ipi_id)
> + imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> +}
> +#else
> +static void imsic_ipi_enable(struct imsic_priv *priv)
> +{
> +}
> +
> +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> +{
> + /* Clear the IPI ids because we are not using IPIs */
> + priv->ipi_id = 0;
> + priv->ipi_lsync_id = 0;
> + return 0;
> +}
> +
> +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> +{
> +}
> +#endif
> +
> +static void imsic_irq_mask(struct irq_data *d)
> +{
> + imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
> +}
> +
> +static void imsic_irq_unmask(struct irq_data *d)
> +{
> + imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
> +}
> +
> +static void imsic_irq_compose_msi_msg(struct irq_data *d,
> + struct msi_msg *msg)
> +{
> + struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> + unsigned int cpu;
> + int err;
> +
> + cpu = imsic_id_get_target(priv, d->hwirq);
> + WARN_ON(cpu == UINT_MAX);
> +
> + err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
> + WARN_ON(err);
> +
> + iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
> +}
> +
> +#ifdef CONFIG_SMP
> +static int imsic_irq_set_affinity(struct irq_data *d,
> + const struct cpumask *mask_val,
> + bool force)
> +{
> + struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> + unsigned int target_cpu;
> + int rc;
> +
> + rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
> + if (rc)
> + return rc;
> +
> + imsic_id_set_target(priv, d->hwirq, target_cpu);
> + irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
> +
> + return IRQ_SET_MASK_OK;
> +}
> +#endif
> +
> +static struct irq_chip imsic_irq_base_chip = {
> + .name = "RISC-V IMSIC-BASE",
> + .irq_mask = imsic_irq_mask,
> + .irq_unmask = imsic_irq_unmask,
> +#ifdef CONFIG_SMP
> + .irq_set_affinity = imsic_irq_set_affinity,
> +#endif
> + .irq_compose_msi_msg = imsic_irq_compose_msi_msg,
> + .flags = IRQCHIP_SKIP_SET_WAKE |
> + IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int imsic_irq_domain_alloc(struct irq_domain *domain,
> + unsigned int virq,
> + unsigned int nr_irqs,
> + void *args)
> +{
> + struct imsic_priv *priv = domain->host_data;
> + msi_alloc_info_t *info = args;
> + phys_addr_t msi_addr;
> + int i, hwirq, err = 0;
> + unsigned int cpu;
> +
> + err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
> + if (err)
> + return err;
> +
> + err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> + if (err)
> + return err;
> +
> + hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
> + get_count_order(nr_irqs));
> + if (hwirq < 0)
> + return hwirq;
> +
> + err = iommu_dma_prepare_msi(info->desc, msi_addr);
> + if (err)
> + goto fail;
> +
> + for (i = 0; i < nr_irqs; i++) {
> + imsic_id_set_target(priv, hwirq + i, cpu);
> + irq_domain_set_info(domain, virq + i, hwirq + i,
> + &imsic_irq_base_chip, priv,
> + handle_simple_irq, NULL, NULL);
> + irq_set_noprobe(virq + i);
> + irq_set_affinity(virq + i, &priv->lmask);
> + }
> +
> + return 0;
> +
> +fail:
> + imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
> + return err;
> +}
> +
> +static void imsic_irq_domain_free(struct irq_domain *domain,
> + unsigned int virq,
> + unsigned int nr_irqs)
> +{
> + struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> + struct imsic_priv *priv = domain->host_data;
> +
> + imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
> + irq_domain_free_irqs_parent(domain, virq, nr_irqs);
> +}
> +
> +static const struct irq_domain_ops imsic_base_domain_ops = {
> + .alloc = imsic_irq_domain_alloc,
> + .free = imsic_irq_domain_free,
> +};
> +
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> +
> +static void imsic_pci_mask_irq(struct irq_data *d)
> +{
> + pci_msi_mask_irq(d);
> + irq_chip_mask_parent(d);
> +}
> +
> +static void imsic_pci_unmask_irq(struct irq_data *d)
> +{
> + pci_msi_unmask_irq(d);
> + irq_chip_unmask_parent(d);
> +}
> +
> +static struct irq_chip imsic_pci_irq_chip = {
> + .name = "RISC-V IMSIC-PCI",
> + .irq_mask = imsic_pci_mask_irq,
> + .irq_unmask = imsic_pci_unmask_irq,
> + .irq_eoi = irq_chip_eoi_parent,
> +};
> +
> +static struct msi_domain_ops imsic_pci_domain_ops = {
> +};
> +
> +static struct msi_domain_info imsic_pci_domain_info = {
> + .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> + MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
> + .ops = &imsic_pci_domain_ops,
> + .chip = &imsic_pci_irq_chip,
> +};
> +
> +#endif
> +
> +static struct irq_chip imsic_plat_irq_chip = {
> + .name = "RISC-V IMSIC-PLAT",
> +};
> +
> +static struct msi_domain_ops imsic_plat_domain_ops = {
> +};
> +
> +static struct msi_domain_info imsic_plat_domain_info = {
> + .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
> + .ops = &imsic_plat_domain_ops,
> + .chip = &imsic_plat_irq_chip,
> +};
> +
> +static int __init imsic_irq_domains_init(struct imsic_priv *priv,
> + struct fwnode_handle *fwnode)
> +{
> + /* Create Base IRQ domain */
> + priv->base_domain = irq_domain_create_tree(fwnode,
> + &imsic_base_domain_ops, priv);
> + if (!priv->base_domain) {
> + pr_err("Failed to create IMSIC base domain\n");
> + return -ENOMEM;
> + }
> + irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
> +
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> + /* Create PCI MSI domain */
> + priv->pci_domain = pci_msi_create_irq_domain(fwnode,
> + &imsic_pci_domain_info,
> + priv->base_domain);
> + if (!priv->pci_domain) {
> + pr_err("Failed to create IMSIC PCI domain\n");
> + irq_domain_remove(priv->base_domain);
> + return -ENOMEM;
> + }
> +#endif
> +
> + /* Create Platform MSI domain */
> + priv->plat_domain = platform_msi_create_irq_domain(fwnode,
> + &imsic_plat_domain_info,
> + priv->base_domain);
> + if (!priv->plat_domain) {
> + pr_err("Failed to create IMSIC platform domain\n");
> + if (priv->pci_domain)
> + irq_domain_remove(priv->pci_domain);
> + irq_domain_remove(priv->base_domain);
> + return -ENOMEM;
> + }
> +
> + return 0;
> +}
> +
> +/*
> + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> + * Linux interrupt number and let Linux IRQ subsystem handle it.
> + */
> +static void imsic_handle_irq(struct irq_desc *desc)
> +{
> + struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> + struct irq_chip *chip = irq_desc_get_chip(desc);
> + struct imsic_priv *priv = handler->priv;
> + irq_hw_number_t hwirq;
> + int err;
> +
> + WARN_ON_ONCE(!handler->priv);
> +
> + chained_irq_enter(chip, desc);
> +
> + while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
> + hwirq = hwirq >> TOPEI_ID_SHIFT;
> +
> + if (hwirq == priv->ipi_id) {
> +#ifdef CONFIG_SMP
> + ipi_mux_process();
> +#endif
> + continue;
> + } else if (hwirq == priv->ipi_lsync_id) {
> + imsic_ids_local_sync(priv);
> + continue;
> + }
> +
> + err = generic_handle_domain_irq(priv->base_domain, hwirq);
> + if (unlikely(err))
> + pr_warn_ratelimited(
> + "hwirq %lu mapping not found\n", hwirq);
> + }
> +
> + chained_irq_exit(chip, desc);
> +}
> +
> +static int imsic_starting_cpu(unsigned int cpu)
> +{
> + struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> + struct imsic_priv *priv = handler->priv;
> +
> + /* Enable per-CPU parent interrupt */
> + if (imsic_parent_irq)
> + enable_percpu_irq(imsic_parent_irq,
> + irq_get_trigger_type(imsic_parent_irq));
Shouldn't that be the default already?
> + else
> + pr_warn("cpu%d: parent irq not available\n", cpu);
And yet continue in sequence? Duh...
> +
> + /* Enable IPIs */
> + imsic_ipi_enable(priv);
> +
> + /*
> + * Interrupts identities might have been enabled/disabled while
> + * this CPU was not running so sync-up local enable/disable state.
> + */
> + imsic_ids_local_sync(priv);
> +
> + /* Locally enable interrupt delivery */
> + imsic_ids_local_delivery(priv, true);
> +
> + return 0;
> +}
> +
> +struct imsic_fwnode_ops {
> + u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
> + void *fwopaque);
> + int (*parent_hartid)(struct fwnode_handle *fwnode,
> + void *fwopaque, u32 index,
> + unsigned long *out_hartid);
> + u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
> + int (*mmio_to_resource)(struct fwnode_handle *fwnode,
> + void *fwopaque, u32 index,
> + struct resource *res);
> + void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
> + void *fwopaque, u32 index);
> + int (*read_u32)(struct fwnode_handle *fwnode,
> + void *fwopaque, const char *prop, u32 *out_val);
> + bool (*read_bool)(struct fwnode_handle *fwnode,
> + void *fwopaque, const char *prop);
> +};
Why do we need this sort of (terrible) indirection?
> +
> +static int __init imsic_init(struct imsic_fwnode_ops *fwops,
> + struct fwnode_handle *fwnode,
> + void *fwopaque)
> +{
> + struct resource res;
> + phys_addr_t base_addr;
> + int rc, nr_parent_irqs;
> + struct imsic_mmio *mmio;
> + struct imsic_priv *priv;
> + struct irq_domain *domain;
> + struct imsic_handler *handler;
> + struct imsic_global_config *global;
> + u32 i, tmp, nr_handlers = 0;
> +
> + if (imsic_init_done) {
> + pr_err("%pfwP: already initialized hence ignoring\n",
> + fwnode);
> + return -ENODEV;
> + }
> +
> + if (!riscv_isa_extension_available(NULL, SxAIA)) {
> + pr_err("%pfwP: AIA support not available\n", fwnode);
> + return -ENODEV;
> + }
> +
> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> + if (!priv)
> + return -ENOMEM;
> + global = &priv->global;
> +
> + /* Find number of parent interrupts */
> + nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
> + if (!nr_parent_irqs) {
> + pr_err("%pfwP: no parent irqs available\n", fwnode);
> + return -EINVAL;
> + }
> +
> + /* Find number of guest index bits in MSI address */
> + rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
> + &global->guest_index_bits);
> + if (rc)
> + global->guest_index_bits = 0;
> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> + if (tmp < global->guest_index_bits) {
> + pr_err("%pfwP: guest index bits too big\n", fwnode);
> + return -EINVAL;
> + }
> +
> + /* Find number of HART index bits */
> + rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
> + &global->hart_index_bits);
> + if (rc) {
> + /* Assume default value */
> + global->hart_index_bits = __fls(nr_parent_irqs);
> + if (BIT(global->hart_index_bits) < nr_parent_irqs)
> + global->hart_index_bits++;
> + }
> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> + global->guest_index_bits;
> + if (tmp < global->hart_index_bits) {
> + pr_err("%pfwP: HART index bits too big\n", fwnode);
> + return -EINVAL;
> + }
> +
> + /* Find number of group index bits */
> + rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
> + &global->group_index_bits);
> + if (rc)
> + global->group_index_bits = 0;
> + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> + global->guest_index_bits - global->hart_index_bits;
> + if (tmp < global->group_index_bits) {
> + pr_err("%pfwP: group index bits too big\n", fwnode);
> + return -EINVAL;
> + }
> +
> + /*
> + * Find first bit position of group index.
> + * If not specified assumed the default APLIC-IMSIC configuration.
> + */
> + rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
> + &global->group_index_shift);
> + if (rc)
> + global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> + tmp = global->group_index_bits + global->group_index_shift - 1;
> + if (tmp >= BITS_PER_LONG) {
> + pr_err("%pfwP: group index shift too big\n", fwnode);
> + return -EINVAL;
> + }
> +
> + /* Find number of interrupt identities */
> + rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
> + &global->nr_ids);
> + if (rc) {
> + pr_err("%pfwP: number of interrupt identities not found\n",
> + fwnode);
> + return rc;
> + }
> + if ((global->nr_ids < IMSIC_MIN_ID) ||
> + (global->nr_ids >= IMSIC_MAX_ID) ||
> + ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> + pr_err("%pfwP: invalid number of interrupt identities\n",
> + fwnode);
> + return -EINVAL;
> + }
> +
> + /* Find number of guest interrupt identities */
> + if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
> + &global->nr_guest_ids))
> + global->nr_guest_ids = global->nr_ids;
> + if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> + (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> + ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> + pr_err("%pfwP: invalid number of guest interrupt identities\n",
> + fwnode);
> + return -EINVAL;
> + }
Please split the whole guest stuff out. It is totally unused!
I've stopped reading. This needs structure, cleanups and a bit of
taste. Not a lot of that here at the moment.
M.
Hey Anup,
I thought I had already replied here but clearly not, sorry!
On Mon, Jan 09, 2023 at 10:39:08AM +0530, Anup Patel wrote:
> On Thu, Jan 5, 2023 at 4:37 AM Conor Dooley <[email protected]> wrote:
> > On Tue, Jan 03, 2023 at 07:44:01PM +0530, Anup Patel wrote:
> > > +/* AIA CSR bits */
> > > +#define TOPI_IID_SHIFT 16
> > > +#define TOPI_IID_MASK 0xfff
While I think of it, it'd be worth noting that these are generic across
all of topi, mtopi etc. Initially I thought that this mask was wrong as
the topi section says:
bits 25:16 Interrupt identity (source number)
bits 7:0 Interrupt priority
> > > +#define TOPI_IPRIO_MASK 0xff
> > > +#define TOPI_IPRIO_BITS 8
> > > +
> > > +#define TOPEI_ID_SHIFT 16
> > > +#define TOPEI_ID_MASK 0x7ff
> > > +#define TOPEI_PRIO_MASK 0x7ff
> > > +
> > > +#define ISELECT_IPRIO0 0x30
> > > +#define ISELECT_IPRIO15 0x3f
> > > +#define ISELECT_MASK 0x1ff
> > > +
> > > +#define HVICTL_VTI 0x40000000
> > > +#define HVICTL_IID 0x0fff0000
> > > +#define HVICTL_IID_SHIFT 16
> > > +#define HVICTL_IPRIOM 0x00000100
> > > +#define HVICTL_IPRIO 0x000000ff
> >
> > Why not name these as masks, like you did for the other masks?
> > Also, the mask/shift defines appear inconsistent. TOPI_IID_MASK is
> > intended to be used post-shift AFAICT, but HVICTL_IID_SHIFT is intended
> > to be used *pre*-shift.
> > Some consistency in naming and function would be great.
>
> The following convention is being followed in asm/csr.h for defining
> MASK of any XYZ field in ABC CSR:
> 1. ABC_XYZ : This name is used for MASK which is intended
> to be used before SHIFT
> 2. ABC_XYZ_MASK: This name is used for MASK which is
> intended to be used after SHIFT
Which makes sense in theory.
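For what it's worth, my reading of the two styles, as a sketch built only
from the defines quoted above (topi_iid() and hvictl_iid() are made-up
helper names, not something in asm/csr.h):

```c
#include <stdint.h>

#define TOPI_IID_SHIFT		16
#define TOPI_IID_MASK		0xfff		/* applied after the shift */

#define HVICTL_IID		0x0fff0000	/* applied before the shift */
#define HVICTL_IID_SHIFT	16

/* ABC_XYZ_MASK style: shift first, then mask. */
static inline uint32_t topi_iid(uint32_t topi)
{
	return (topi >> TOPI_IID_SHIFT) & TOPI_IID_MASK;
}

/* ABC_XYZ style: mask first, then shift. */
static inline uint32_t hvictl_iid(uint32_t hvictl)
{
	return (hvictl & HVICTL_IID) >> HVICTL_IID_SHIFT;
}
```

Both extract the same 12-bit field; the only difference is the order of
operations that the name of the define implies.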
> The existing defines for [M|S]STATUS, HSTATUS, SATP, and xENVCFG
> follows the above convention. The only outlier is HGATPx_VMID_MASK
> define which I will fix in my next KVM RISC-V series.
Yup, it is liable to end up like that.
> I don't see how any of the AIA CSR defines are violating the above
> convention.
What I was advocating for was picking one style and sticking to it.
These copy-paste from docs things are tedious and error-prone to review,
and I don't think having multiple styles is helpful.
Tedious as it was, I did check all the numbers though, so in that
respect:
Reviewed-by: Conor Dooley <[email protected]>
Thanks,
Conor.
On Wed, Jan 18, 2023 at 2:12 AM Conor Dooley <[email protected]> wrote:
>
> Hey Anup,
>
> I thought I had already replied here but clearly not, sorry!
>
> On Mon, Jan 09, 2023 at 10:39:08AM +0530, Anup Patel wrote:
> > On Thu, Jan 5, 2023 at 4:37 AM Conor Dooley <[email protected]> wrote:
> > > On Tue, Jan 03, 2023 at 07:44:01PM +0530, Anup Patel wrote:
>
> > > > +/* AIA CSR bits */
> > > > +#define TOPI_IID_SHIFT 16
> > > > +#define TOPI_IID_MASK 0xfff
>
> While I think of it, it'd be worth noting that these are generic across
> all of topi, mtopi etc. Initially I thought that this mask was wrong as
> the topi section says:
> bits 25:16 Interrupt identity (source number)
> bits 7:0 Interrupt priority
These defines are for the AIA CSRs and not AIA APLIC IDC registers.
As per the latest frozen spec, mtopi/stopi/vstopi have the following bits:
bits: 27:16 IID
bits: 7:0 IPRIO
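For illustration only, here is a minimal standalone sketch of decoding that layout with the defines quoted above. The helper names topi_iid() and topi_iprio() are hypothetical and are not part of the patch:

```c
/* Values copied from the defines quoted in this thread. */
#define TOPI_IID_SHIFT   16
#define TOPI_IID_MASK    0xfff
#define TOPI_IPRIO_MASK  0xff

/* Hypothetical helper: extract bits 27:16 (IID) from an xtopi value. */
static inline unsigned long topi_iid(unsigned long topi)
{
	return (topi >> TOPI_IID_SHIFT) & TOPI_IID_MASK;
}

/* Hypothetical helper: extract bits 7:0 (IPRIO) from an xtopi value. */
static inline unsigned long topi_iprio(unsigned long topi)
{
	return topi & TOPI_IPRIO_MASK;
}
```

Both masks are applied after any shift, which is why TOPI_IID_MASK is 0xfff rather than 0x0fff0000.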
>
> > > > +#define TOPI_IPRIO_MASK 0xff
> > > > +#define TOPI_IPRIO_BITS 8
> > > > +
> > > > +#define TOPEI_ID_SHIFT 16
> > > > +#define TOPEI_ID_MASK 0x7ff
> > > > +#define TOPEI_PRIO_MASK 0x7ff
> > > > +
> > > > +#define ISELECT_IPRIO0 0x30
> > > > +#define ISELECT_IPRIO15 0x3f
> > > > +#define ISELECT_MASK 0x1ff
> > > > +
> > > > +#define HVICTL_VTI 0x40000000
> > > > +#define HVICTL_IID 0x0fff0000
> > > > +#define HVICTL_IID_SHIFT 16
> > > > +#define HVICTL_IPRIOM 0x00000100
> > > > +#define HVICTL_IPRIO 0x000000ff
> > >
> > > Why not name these as masks, like you did for the other masks?
> > > Also, the mask/shift defines appear inconsistent. TOPI_IID_MASK is
> > > intended to be used post-shift AFAICT, but HVICTL_IID_SHIFT is intended
> > > to be used *pre*-shift.
> > > Some consistency in naming and function would be great.
> >
> > The following convention is being followed in asm/csr.h for defining
> > MASK of any XYZ field in ABC CSR:
> > 1. ABC_XYZ : This name is used for MASK which is intended
> > to be used before SHIFT
> > 2. ABC_XYZ_MASK: This name is used for MASK which is
> > intended to be used after SHIFT
>
> Which makes sense in theory.
>
> > The existing defines for [M|S]STATUS, HSTATUS, SATP, and xENVCFG
> > follows the above convention. The only outlier is HGATPx_VMID_MASK
> > define which I will fix in my next KVM RISC-V series.
>
> Yup, it is liable to end up like that.
>
> > I don't see how any of the AIA CSR defines are violating the above
> > convention.
>
> What I was advocating for was picking one style and sticking to it.
> These copy-paste from docs things are tedious and error prone to review,
> and I don't think having multiple styles is helpful.
On the other hand, I think we should let developers choose a style
which is better suited for a particular register field instead of enforcing
one here. The best we can do is follow a naming convention for defines.
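As a rough sketch of how the two naming styles differ in use (standalone C, not taken from the patch; the accessor names are hypothetical, while the numeric values are the ones quoted above):

```c
#include <stdint.h>

/* Style 1 (ABC_XYZ): in-place mask, applied BEFORE shifting. */
#define HVICTL_IID        0x0fff0000
#define HVICTL_IID_SHIFT  16

/* Style 2 (ABC_XYZ_MASK): field-width mask, applied AFTER shifting. */
#define TOPEI_ID_SHIFT    16
#define TOPEI_ID_MASK     0x7ff

/* Hypothetical accessor using style 1: mask first, then shift down. */
static inline uint32_t hvictl_get_iid(uint32_t hvictl)
{
	return (hvictl & (uint32_t)HVICTL_IID) >> HVICTL_IID_SHIFT;
}

/* Hypothetical updater using style 1: clear the field, then insert. */
static inline uint32_t hvictl_set_iid(uint32_t hvictl, uint32_t iid)
{
	hvictl &= ~(uint32_t)HVICTL_IID;
	return hvictl | ((iid << HVICTL_IID_SHIFT) & (uint32_t)HVICTL_IID);
}

/* Hypothetical accessor using style 2: shift down first, then mask. */
static inline uint32_t topei_get_id(uint32_t topei)
{
	return (topei >> TOPEI_ID_SHIFT) & TOPEI_ID_MASK;
}
```

Style 1 tends to read better when a field is tested or modified in place, while style 2 is convenient when the extracted value is what matters.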
>
> Tedious as it was, I did check all the numbers though, so in that
> respect:
> Reviewed-by: Conor Dooley <[email protected]>
BTW, this patch is shared with the KVM AIA CSR series, so most likely
I will take this patch through that series.
Regards,
Anup
On Fri, Jan 27, 2023 at 05:28:57PM +0530, Anup Patel wrote:
> On Wed, Jan 18, 2023 at 2:12 AM Conor Dooley <[email protected]> wrote:
> > > > > +/* AIA CSR bits */
> > > > > +#define TOPI_IID_SHIFT 16
> > > > > +#define TOPI_IID_MASK 0xfff
> >
> > While I think of it, it'd be worth noting that these are generic across
> > all of topi, mtopi etc. Initially I thought that this mask was wrong as
> > the topi section says:
> > bits 25:16 Interrupt identity (source number)
> > bits 7:0 Interrupt priority
>
> These defines are for the AIA CSRs and not AIA APLIC IDC registers.
>
> As per the latest frozen spec, the mtopi/stopi/vstopi has following bits:
> bits: 27:16 IID
> bits: 7:0 IPRIO
I know that those ones use those bits, hence my leaving an R-b for the
patch, but your define says TOPI, for which it is *not* accurate.
That is confusing and should be noted.
> > What I was advocating for was picking one style and sticking to it.
> > These copy-paste from docs things are tedious and error prone to review,
> > and I don't think having multiple styles is helpful.
>
> On the other hand, I think we should let developers choose a style
> which is better suited for a particular register field instead enforcing
> it here. The best we can do is follow a naming convention for defines.
We shall have to agree to disagree, I suppose!
> > Tedious as it was, I did check all the numbers though, so in that
> > respect:
> > Reviewed-by: Conor Dooley <[email protected]>
>
> BTW, this patch is shared with KVM AIA CSR series so most likely
> I will take this patch through that series.
Since the path by which it gets applied is for you and Palmer to
decide, feel free to add the R-b whichever way the patch ends up going!
Thanks,
Conor.
On 1/3/23 22:14, Anup Patel wrote:
> We add DT bindings document for RISC-V advanced platform level
> interrupt controller (APLIC) defined by the RISC-V advanced
> interrupt architecture (AIA) specification.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> .../interrupt-controller/riscv,aplic.yaml | 159 ++++++++++++++++++
> 1 file changed, 159 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
>
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> new file mode 100644
> index 000000000000..b7f20aad72c2
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> @@ -0,0 +1,159 @@
>
> <snip>
>
> + riscv,children:
> + $ref: /schemas/types.yaml#/definitions/phandle-array
> + minItems: 1
> + maxItems: 1024
> + items:
> + maxItems: 1
> + description:
> + A list of child APLIC domains for the given APLIC domain. Each child
> + APLIC domain is assigned child index in increasing order with the
> + first child APLIC domain assigned child index 0. The APLIC domain
> + child index is used by firmware to delegate interrupts from the
> + given APLIC domain to a particular child APLIC domain.
> +
> + riscv,delegate:
> + $ref: /schemas/types.yaml#/definitions/phandle-array
> + minItems: 1
> + maxItems: 1024
> + items:
> + items:
> + - description: child APLIC domain phandle
> + - description: first interrupt number (inclusive)
> + - description: last interrupt number (inclusive)
> + description:
> + A interrupt delegation list where each entry is a triple consisting
> + of child APLIC domain phandle, first interrupt number, and last
> + interrupt number. The firmware will configure interrupt delegation
> + registers based on interrupt delegation list.
> +
I'm not sure if this is the right place to ask, since it could be more
of an OpenSBI/QEMU problem, but I think a more detailed description of
what 'the firmware' does is appropriate here.
My main confusion is how to describe wired interrupts connected to
APLICs. Say we have two APLIC nodes with labels aplic_m and aplic_s that
are the APLIC domains for M-mode and S-mode respectively. IIUC, wired
interrupts are connected directly to aplic_m. So how do I refer to it in
the device nodes?
1. <&aplic_s num IRQ_TYPE_foo>, but it would be a lie to M-mode
software, which could be a problem. QEMU 7.2.0 seems to take this
approach. (I could also be misunderstanding QEMU and it actually
does connect wired interrupts to the S-mode APLIC, but then
riscv,children and riscv,delegate would be lies.)
2. <&aplic_m ...>, and when M-mode software gives S-mode software
access to devices, it delegates relevant interrupts and patches it
into <&aplic_s num IRQ_TYPE_foo>. Seems to be the 'correct'
approach, but pretty complicated.
3. <&aplic_m ...>, S-mode software sees this, and sees that aplic_m has
num in riscv,delegate, so goes to find the child it's been delegated
to, which is (should be) aplic_s. A bit annoyingly abstraction
breaking, since S-mode shouldn't even need to know about aplic_m.
I see that others are also confused by riscv,delegate and riscv,children
properties. It would be great if we could clarify the expected behavior
here rather than just saying 'the firmware will do the thing'.
> <snip>
> +...
Thanks,
Vivian
On Thu, Jan 5, 2023 at 3:47 AM Conor Dooley <[email protected]> wrote:
>
> Hey Anup,
>
> On Tue, Jan 03, 2023 at 07:44:06PM +0530, Anup Patel wrote:
> > We add DT bindings document for RISC-V advanced platform level
> > interrupt controller (APLIC) defined by the RISC-V advanced
> > interrupt architecture (AIA) specification.
> >
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > .../interrupt-controller/riscv,aplic.yaml | 159 ++++++++++++++++++
> > 1 file changed, 159 insertions(+)
> > create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
>
> > + interrupts-extended:
> > + minItems: 1
> > + maxItems: 16384
> > + description:
> > + Given APLIC domain directly injects external interrupts to a set of
> > + RISC-V HARTS (or CPUs). Each node pointed to should be a riscv,cpu-intc
> > + node, which has a riscv node (i.e. RISC-V HART) as parent.
> > +
> > + msi-parent:
> > + description:
> > + Given APLIC domain forwards wired interrupts as MSIs to a AIA incoming
> > + message signaled interrupt controller (IMSIC). This property should be
> > + considered only when the interrupts-extended property is absent.
>
> Considered by what?
> On v1 you said:
> <quote>
> If both "interrupts-extended" and "msi-parent" are present then it means
> the APLIC domain supports both MSI mode and Direct mode in HW. In this
> case, the APLIC driver has to choose between MSI mode or Direct mode.
> <\quote>
>
> The description is still pretty ambiguous IMO. Perhaps incorporate
> some of that expanded comment into the property description?
> Say, "If both foo and bar are present, the APLIC domain has hardware
> support for both MSI and direct mode. Software may then chose either
> mode".
> Have I misunderstood your comment on v1? It read as if having both
> present indicated that both were possible & that "should be considered
> only..." was more of a suggestion and a comment about the Linux driver's
> behaviour.
> Apologies if I have misunderstood, but I suppose if I have then the
> binding's description could be improved!!
Yes, when both DT properties are present, it is up to the Linux
APLIC driver to choose the appropriate APLIC mode.
I forgot to update the text here in v2, but I will update it in v3.
Thanks for pointing this out.
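To make the intended semantics concrete, here is a sketch only. The enum, the function name and the prefer_msi knob are all hypothetical and do not come from the driver; the policy for the "both present" case is an assumption chosen for illustration:

```c
#include <stdbool.h>

/* Hypothetical result type, not a kernel definition. */
enum aplic_mode {
	APLIC_MODE_NONE,    /* neither property present: unusable node */
	APLIC_MODE_DIRECT,  /* inject external interrupts into HARTs */
	APLIC_MODE_MSI,     /* forward wired interrupts as MSIs */
};

/*
 * Pick a mode from which DT properties exist on the APLIC node.
 * When only one property is present, the hardware dictates the mode.
 * When both are present, software is free to choose; prefer_msi
 * stands in for whatever policy a driver might apply.
 */
static inline enum aplic_mode aplic_pick_mode(bool has_interrupts_extended,
					      bool has_msi_parent,
					      bool prefer_msi)
{
	if (has_interrupts_extended && has_msi_parent)
		return prefer_msi ? APLIC_MODE_MSI : APLIC_MODE_DIRECT;
	if (has_msi_parent)
		return APLIC_MODE_MSI;
	if (has_interrupts_extended)
		return APLIC_MODE_DIRECT;
	return APLIC_MODE_NONE;
}
```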
>
> > + riscv,children:
> > + $ref: /schemas/types.yaml#/definitions/phandle-array
> > + minItems: 1
> > + maxItems: 1024
> > + items:
> > + maxItems: 1
> > + description:
> > + A list of child APLIC domains for the given APLIC domain. Each child
> > + APLIC domain is assigned child index in increasing order with the
>
> btw, missing article before child (& a comma after order I think).
Okay, I will update.
>
> > + first child APLIC domain assigned child index 0. The APLIC domain
> > + child index is used by firmware to delegate interrupts from the
> > + given APLIC domain to a particular child APLIC domain.
> > +
> > + riscv,delegate:
> > + $ref: /schemas/types.yaml#/definitions/phandle-array
> > + minItems: 1
> > + maxItems: 1024
>
> Is it valid to have a delegate property without children? If not, the
> binding should reflect that dependency IMO.
Okay, I will update.
>
> > + items:
> > + items:
> > + - description: child APLIC domain phandle
> > + - description: first interrupt number (inclusive)
> > + - description: last interrupt number (inclusive)
> > + description:
> > + A interrupt delegation list where each entry is a triple consisting
> > + of child APLIC domain phandle, first interrupt number, and last
> > + interrupt number. The firmware will configure interrupt delegation
>
> btw, drop the article before firmware here.
> Also, "firmware will" or "firmware must"? Semantics perhaps, but they
> are different!
I think "firmware must" is better because APLIC M-mode domains are
not accessible to S-mode so firmware must configure delegation for
at least all APLIC M-mode domains.
>
> Kinda for my own curiosity here, but do you expect these properties to
> generally be dynamically filled in by the bootloader or read by the
> bootloader to set up the configuration?
Firmware (or bootloader) will look at this property and set up delegation
before booting the OS kernel.
>
> > + registers based on interrupt delegation list.
>
> I'm sorry Anup, but this child versus delegate thing is still not clear
> to me binding wise. See below.
There are two different pieces of information in the context of an APLIC domain:
1) HW child domain numbering: If an APLIC domain has N children
then HW will have a fixed child index for each of the N children
in the range 0 to N-1. This HW child index is required at the time
of setting up interrupt delegation in sourcecfgX registers. The
"riscv,children" DT property helps firmware (or bootloader) find
the total number of child APLIC domains and corresponding
HW child index number.
2) IRQ delegation to child domains: An APLIC domain can delegate
any IRQ range(s) to a particular APLIC child domain. The
"riscv,delegate" DT property is simply a table where we have
one row for each IRQ range which is delegated to some child
APLIC domain. This property is more of a system setting fixed
by the RISC-V platform vendor.
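A minimal sketch of how firmware could combine the two properties is shown below. It assumes the AIA sourcecfg layout (bit 10 is D, bits 9:0 hold the child index when D = 1). The struct, the function and the idea of resolving each phandle to its position in "riscv,children" beforehand are hypothetical, not OpenSBI code:

```c
#include <stddef.h>
#include <stdint.h>

/* sourcecfg fields used for delegation (assumed from the AIA spec). */
#define SOURCECFG_D           (1u << 10)
#define SOURCECFG_CHILD_MASK  0x3ffu

/*
 * Hypothetical in-memory form of one "riscv,delegate" row, with the
 * child phandle already resolved to its index in "riscv,children".
 */
struct delegate_row {
	uint32_t child_index;
	uint32_t first_irq;	/* inclusive */
	uint32_t last_irq;	/* inclusive */
};

/*
 * Compute sourcecfg[1..num_sources] values for the delegated ranges.
 * Index 0 is unused because interrupt source numbers start at 1.
 * Returns 0 on success, or -1 if a row is out of range.
 */
static int build_sourcecfg(const struct delegate_row *rows, size_t num_rows,
			   uint32_t num_children, uint32_t num_sources,
			   uint32_t *sourcecfg)
{
	for (size_t r = 0; r < num_rows; r++) {
		const struct delegate_row *row = &rows[r];

		if (row->child_index >= num_children ||
		    row->first_irq < 1 ||
		    row->last_irq > num_sources ||
		    row->first_irq > row->last_irq)
			return -1;

		for (uint32_t irq = row->first_irq; irq <= row->last_irq; irq++)
			sourcecfg[irq] = SOURCECFG_D |
				(row->child_index & SOURCECFG_CHILD_MASK);
	}

	return 0;
}
```

For the example in the binding (riscv,children = <&aplic1>, <&aplic2> and riscv,delegate = <&aplic1 1 63>), this would be a single row with child index 0 covering sources 1 to 63.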
>
> > + aplic0: interrupt-controller@c000000 {
> > + compatible = "riscv,qemu-aplic", "riscv,aplic";
> > + interrupts-extended = <&cpu1_intc 11>,
> > + <&cpu2_intc 11>,
> > + <&cpu3_intc 11>,
> > + <&cpu4_intc 11>;
> > + reg = <0xc000000 0x4080>;
> > + interrupt-controller;
> > + #interrupt-cells = <2>;
> > + riscv,num-sources = <63>;
> > + riscv,children = <&aplic1>, <&aplic2>;
> > + riscv,delegate = <&aplic1 1 63>;
>
> Is aplic2 here for demonstrative purposes only, since it has not been
> delegated any interrupts?
Yes, it's for demonstrative purposes only.
> I suppose it is hardware present on the SoC that is not being used by
> the current configuration?
Yes, in this example aplic2 is unused because it has no interrupts
delegated to it.
>
> Thanks,
> Conor.
>
> > + };
> > +
> > + aplic1: interrupt-controller@d000000 {
> > + compatible = "riscv,qemu-aplic", "riscv,aplic";
> > + interrupts-extended = <&cpu1_intc 9>,
> > + <&cpu2_intc 9>;
> > + reg = <0xd000000 0x4080>;
> > + interrupt-controller;
> > + #interrupt-cells = <2>;
> > + riscv,num-sources = <63>;
> > + };
> > +
> > + aplic2: interrupt-controller@e000000 {
> > + compatible = "riscv,qemu-aplic", "riscv,aplic";
> > + interrupts-extended = <&cpu3_intc 9>,
> > + <&cpu4_intc 9>;
> > + reg = <0xe000000 0x4080>;
> > + interrupt-controller;
> > + #interrupt-cells = <2>;
> > + riscv,num-sources = <63>;
> > + };
>
Regards,
Anup
On Sun, Feb 19, 2023 at 5:18 PM Vivian Wang <[email protected]> wrote:
>
> On 1/3/23 22:14, Anup Patel wrote:
> > We add DT bindings document for RISC-V advanced platform level
> > interrupt controller (APLIC) defined by the RISC-V advanced
> > interrupt architecture (AIA) specification.
> >
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > .../interrupt-controller/riscv,aplic.yaml | 159 ++++++++++++++++++
> > 1 file changed, 159 insertions(+)
> > create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> >
> > diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> > new file mode 100644
> > index 000000000000..b7f20aad72c2
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> > @@ -0,0 +1,159 @@
> >
> > <snip>
> >
> > + riscv,children:
> > + $ref: /schemas/types.yaml#/definitions/phandle-array
> > + minItems: 1
> > + maxItems: 1024
> > + items:
> > + maxItems: 1
> > + description:
> > + A list of child APLIC domains for the given APLIC domain. Each child
> > + APLIC domain is assigned child index in increasing order with the
> > + first child APLIC domain assigned child index 0. The APLIC domain
> > + child index is used by firmware to delegate interrupts from the
> > + given APLIC domain to a particular child APLIC domain.
> > +
> > + riscv,delegate:
> > + $ref: /schemas/types.yaml#/definitions/phandle-array
> > + minItems: 1
> > + maxItems: 1024
> > + items:
> > + items:
> > + - description: child APLIC domain phandle
> > + - description: first interrupt number (inclusive)
> > + - description: last interrupt number (inclusive)
> > + description:
> > + A interrupt delegation list where each entry is a triple consisting
> > + of child APLIC domain phandle, first interrupt number, and last
> > + interrupt number. The firmware will configure interrupt delegation
> > + registers based on interrupt delegation list.
> > +
>
> I'm not sure if this is the right place to ask, since it could be more
> of a OpenSBI/QEMU problem, but I think a more detailed description about
> what 'the firmware' does is appropriate here.
>
> My main confusion is how to describe wired interrupts connected to
> APLICs. Say we have two APLIC nodes with labels aplic_m and aplic_s that
> are the APLIC domains for M-mode and S-mode respectively. IIUC, wired
> interrupts are connected directly to aplic_m. So how do I refer to it in
> the device nodes?
Please see my previous reply to Conor about these DT properties.
The riscv,children DT property describes HW child numbering, whereas
the riscv,delegate DT property is a table of IRQ delegation.
In your example, let's assume we have N wired interrupts. This
means we will have devices connected to the root APLIC domain
(aplic_m). Now since aplic_s is a child of aplic_m, we will have
N wired interrupts going from aplic_m to aplic_s, where
aplic_m will route a wired/device interrupt x to aplic_s if
sourcecfg[x].D = 1 and sourcecfg[x].child = 0.
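The routing condition above can be checked with a small decode helper. This is a sketch under the same assumption about the sourcecfg layout (bit 10 is D, bits 9:0 are the child index when D = 1); the function name is hypothetical:

```c
#include <stdint.h>

#define SOURCECFG_D           (1u << 10)
#define SOURCECFG_CHILD_MASK  0x3ffu

/*
 * Return the child index a source is delegated to, or -1 if the
 * source is not delegated (D = 0) and is handled by this domain.
 */
static inline int sourcecfg_delegated_child(uint32_t sourcecfg)
{
	if (!(sourcecfg & SOURCECFG_D))
		return -1;

	return (int)(sourcecfg & SOURCECFG_CHILD_MASK);
}
```

With aplic_s as the first (index 0) child of aplic_m, interrupt x reaches aplic_s exactly when this helper returns 0 for aplic_m's sourcecfg[x].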
>
> 1. <&aplic_s num IRQ_TYPE_foo>, but it would be a lie to M-mode
> software, which could be a problem. QEMU 7.2.0 seems to take this
> approach. (I could also be misunderstanding QEMU and it actually
> does connect wired interrupts to the S-mode APLIC, but then
> riscv,children and riscv,delegate would be lies.)
No, it's not a lie. The <&aplic_s num IRQ_TYPE_foo> in a device DT
node is based on the IRQ delegation fixed by the RISC-V platform.
QEMU has its own strategy of delegating IRQs to APLIC S-mode
while other platforms can use a different strategy.
> 2. <&aplic_m ...>, and when M-mode software gives S-mode software
> access to devices, it delegates relevant interrupts and patches it
> into <&aplic_s num IRQ_TYPE_foo>. Seems to be the 'correct'
> approach, but pretty complicated.
The APLIC M-mode domain is not accessible to S-mode software, so
Linux cannot create an irqdomain using the APLIC M-mode DT node. This
means device DT nodes must have <&aplic_s num IRQ_TYPE_foo>,
which points to the APLIC S-mode domain.
It is totally up to the RISC-V firmware and platform whether to dynamically
add/patch <&aplic_s num IRQ_TYPE_foo> in device DT nodes. Currently,
we do not patch device DT nodes in OpenSBI and instead have the
device DT nodes point to the correct APLIC domain based on the IRQ
delegation.
> 3. <&aplic_m ...>, S-mode software sees this, and sees that aplic_m has
> num in riscv,delegate, so goes to find the child it's been delegated
> to, which is (should be) aplic_s. A bit annoyingly abstraction
> breaking, since S-mode shouldn't even need to know about aplic_m.
Yes, S-mode should not know about aplic_m, and if it tries to access aplic_m
then it will get an access fault.
This is exactly why a device DT node should have its "interrupts" DT property
pointing to the actual APLIC domain which is delivering the interrupt to S-mode.
>
> I see that others are also confused by riscv,delegate and riscv,children
> properties. It would be great if we could clarify the expected behavior
> here rather than just saying 'the firmware will do the thing'.
Regards,
Anup
Hey Anup,
On Mon, Feb 20, 2023 at 10:06:49AM +0530, Anup Patel wrote:
> On Thu, Jan 5, 2023 at 3:47 AM Conor Dooley <[email protected]> wrote:
> > On Tue, Jan 03, 2023 at 07:44:06PM +0530, Anup Patel wrote:
> > > We add DT bindings document for RISC-V advanced platform level
> > > interrupt controller (APLIC) defined by the RISC-V advanced
> > > interrupt architecture (AIA) specification.
> > >
> > > Signed-off-by: Anup Patel <[email protected]>
> > > ---
> > > .../interrupt-controller/riscv,aplic.yaml | 159 ++++++++++++++++++
> > > 1 file changed, 159 insertions(+)
> > > create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
> > I'm sorry Anup, but this child versus delegate thing is still not clear
> > to me binding wise. See below.
>
> There are two different information in-context of APLIC domain:
>
> 1) HW child domain numbering: If an APLIC domain has N children
> then HW will have a fixed child index for each of the N children
> in the range 0 to N-1. This HW child index is required at the time
> of setting up interrupt delegation in sourcecfgX registers. The
> "riscv,children" DT property helps firmware (or bootloader) find
> the total number of child APLIC domains and corresponding
> HW child index number.
>
> 2) IRQ delegation to child domains: An APLIC domain can delegate
> any IRQ range(s) to a particular APLIC child domain. The
> "riscv,delegate" DT property is simply a table where we have
> one row for each IRQ range which is delegated to some child
> APLIC domain. This property is more of a system setting fixed
> by the RISC-V platform vendor.
Thanks for the explanations. It's been a while since my brain swapped
this stuff out, but I think delegate/child makes sense to me now.
Just don't ask me to write the dt entry as proof...
Thanks,
Conor.
On Mon, Feb 20, 2023 at 10:32:57AM +0000, Conor Dooley wrote:
> On Mon, Feb 20, 2023 at 10:06:49AM +0530, Anup Patel wrote:
> > On Thu, Jan 5, 2023 at 3:47 AM Conor Dooley <[email protected]> wrote:
> > > On Tue, Jan 03, 2023 at 07:44:06PM +0530, Anup Patel wrote:
> > > > We add DT bindings document for RISC-V advanced platform level
> > > > interrupt controller (APLIC) defined by the RISC-V advanced
> > > > interrupt architecture (AIA) specification.
> > > >
> > > > Signed-off-by: Anup Patel <[email protected]>
> > > > ---
> > > > .../interrupt-controller/riscv,aplic.yaml | 159 ++++++++++++++++++
> > > > 1 file changed, 159 insertions(+)
> > > > create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
>
> > > I'm sorry Anup, but this child versus delegate thing is still not clear
> > > to me binding wise. See below.
> >
> > There are two different information in-context of APLIC domain:
> >
> > 1) HW child domain numbering: If an APLIC domain has N children
> > then HW will have a fixed child index for each of the N children
> > in the range 0 to N-1. This HW child index is required at the time
> > of setting up interrupt delegation in sourcecfgX registers. The
> > "riscv,children" DT property helps firmware (or bootloader) find
> > the total number of child APLIC domains and corresponding
> > HW child index number.
> >
> > 2) IRQ delegation to child domains: An APLIC domain can delegate
> > any IRQ range(s) to a particular APLIC child domain. The
> > "riscv,delegate" DT property is simply a table where we have
> > one row for each IRQ range which is delegated to some child
> > APLIC domain. This property is more of a system setting fixed
> > by the RISC-V platform vendor.
>
> Thanks for the explanations. It's been a while since my brain swapped
> this stuff out, but I think delegate/child makes sense to me now.
> Just don't ask me to write the dt entry as proof...
Having looked at Dramforever's QEMU dtb dump a bit more and your
responses to her, I think that I have "come to terms" with it now
actually.
I suppose when the next version comes around I'll make sure that I
arrive in the same ballpark that QEMU does, based off the descriptions
etc in the binding.
Thanks!
On Fri, Jan 13, 2023 at 3:40 PM Marc Zyngier <[email protected]> wrote:
>
> On Tue, 03 Jan 2023 14:14:05 +0000,
> Anup Patel <[email protected]> wrote:
> >
> > The RISC-V advanced interrupt architecture (AIA) specification defines
> > a new MSI controller for managing MSIs on a RISC-V platform. This new
> > MSI controller is referred to as incoming message signaled interrupt
> > controller (IMSIC) which manages MSI on per-HART (or per-CPU) basis.
> > (For more details refer https://github.com/riscv/riscv-aia)
>
> And how about IPIs, which this driver seems to be concerned about?
Okay, I will mention IPIs in the commit description.
>
> >
> > This patch adds an irqchip driver for RISC-V IMSIC found on RISC-V
> > platforms.
> >
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > drivers/irqchip/Kconfig | 14 +-
> > drivers/irqchip/Makefile | 1 +
> > drivers/irqchip/irq-riscv-imsic.c | 1174 +++++++++++++++++++++++++++
> > include/linux/irqchip/riscv-imsic.h | 92 +++
> > 4 files changed, 1280 insertions(+), 1 deletion(-)
> > create mode 100644 drivers/irqchip/irq-riscv-imsic.c
> > create mode 100644 include/linux/irqchip/riscv-imsic.h
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index 9e65345ca3f6..a1315189a595 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -29,7 +29,6 @@ config ARM_GIC_V2M
> >
> > config GIC_NON_BANKED
> > bool
> > -
> > config ARM_GIC_V3
> > bool
> > select IRQ_DOMAIN_HIERARCHY
> > @@ -548,6 +547,19 @@ config SIFIVE_PLIC
> > select IRQ_DOMAIN_HIERARCHY
> > select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> >
> > +config RISCV_IMSIC
> > + bool
> > + depends on RISCV
> > + select IRQ_DOMAIN_HIERARCHY
> > + select GENERIC_MSI_IRQ_DOMAIN
> > +
> > +config RISCV_IMSIC_PCI
> > + bool
> > + depends on RISCV_IMSIC
> > + depends on PCI
> > + depends on PCI_MSI
> > + default RISCV_IMSIC
>
> This should definitely tell you that this driver needs splitting.
The code under "#ifdef CONFIG_RISCV_IMSIC_PCI" is hardly 40 lines
so I felt it was too small to deserve its own source file.
>
> > +
> > config EXYNOS_IRQ_COMBINER
> > bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> > depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index 87b49a10962c..22c723cc6ec8 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> > obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> > obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> > obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> > +obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
> > obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> > obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> > obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
> > diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> > new file mode 100644
> > index 000000000000..4c16b66738d6
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-imsic.c
> > @@ -0,0 +1,1174 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > +#include <linux/bitmap.h>
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/iommu.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqchip/chained_irq.h>
> > +#include <linux/irqchip/riscv-imsic.h>
> > +#include <linux/irqdomain.h>
> > +#include <linux/module.h>
> > +#include <linux/msi.h>
> > +#include <linux/of.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of_irq.h>
> > +#include <linux/pci.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/smp.h>
> > +#include <asm/hwcap.h>
> > +
> > +#define IMSIC_DISABLE_EIDELIVERY 0
> > +#define IMSIC_ENABLE_EIDELIVERY 1
> > +#define IMSIC_DISABLE_EITHRESHOLD 1
> > +#define IMSIC_ENABLE_EITHRESHOLD 0
> > +
> > +#define imsic_csr_write(__c, __v) \
> > +do { \
> > + csr_write(CSR_ISELECT, __c); \
> > + csr_write(CSR_IREG, __v); \
> > +} while (0)
> > +
> > +#define imsic_csr_read(__c) \
> > +({ \
> > + unsigned long __v; \
> > + csr_write(CSR_ISELECT, __c); \
> > + __v = csr_read(CSR_IREG); \
> > + __v; \
> > +})
> > +
> > +#define imsic_csr_set(__c, __v) \
> > +do { \
> > + csr_write(CSR_ISELECT, __c); \
> > + csr_set(CSR_IREG, __v); \
> > +} while (0)
> > +
> > +#define imsic_csr_clear(__c, __v) \
> > +do { \
> > + csr_write(CSR_ISELECT, __c); \
> > + csr_clear(CSR_IREG, __v); \
> > +} while (0)
> > +
> > +struct imsic_mmio {
> > + phys_addr_t pa;
> > + void __iomem *va;
> > + unsigned long size;
> > +};
> > +
> > +struct imsic_priv {
> > + /* Global configuration common for all HARTs */
> > + struct imsic_global_config global;
> > +
> > + /* MMIO regions */
> > + u32 num_mmios;
> > + struct imsic_mmio *mmios;
> > +
> > + /* Global state of interrupt identities */
> > + raw_spinlock_t ids_lock;
> > + unsigned long *ids_used_bimap;
> > + unsigned long *ids_enabled_bimap;
> > + unsigned int *ids_target_cpu;
> > +
> > + /* Mask for connected CPUs */
> > + struct cpumask lmask;
> > +
> > + /* IPI interrupt identity */
> > + u32 ipi_id;
> > + u32 ipi_lsync_id;
> > +
> > + /* IRQ domains */
> > + struct irq_domain *base_domain;
> > + struct irq_domain *pci_domain;
> > + struct irq_domain *plat_domain;
> > +};
> > +
> > +struct imsic_handler {
> > + /* Local configuration for given HART */
> > + struct imsic_local_config local;
> > +
> > + /* Pointer to private context */
> > + struct imsic_priv *priv;
> > +};
> > +
> > +static bool imsic_init_done;
> > +
> > +static int imsic_parent_irq;
> > +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> > +
> > +const struct imsic_global_config *imsic_get_global_config(void)
> > +{
> > + struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > +
> > + if (!handler || !handler->priv)
> > + return NULL;
> > +
> > + return &handler->priv->global;
> > +}
> > +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> > +
> > +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> > +{
> > + struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +
> > + if (!handler || !handler->priv)
> > + return NULL;
>
> How can this happen?
These are redundant checks. I will drop them.
>
> > +
> > + return &handler->local;
> > +}
> > +EXPORT_SYMBOL_GPL(imsic_get_local_config);
>
> Why are these symbols exported? They have no user, so they shouldn't
> even exist here. I also seriously doubt there is a valid use case for
> exposing this information to the rest of the kernel.
imsic_get_global_config() is used by the APLIC driver and the KVM RISC-V
module, whereas imsic_get_local_config() is only used by KVM RISC-V.
The KVM RISC-V AIA irqchip patches are available in the riscv_kvm_aia_v1
branch at: https://github.com/avpatel/linux.git. I have not posted the KVM RISC-V
patches due to various interdependencies.
>
> > +
> > +static int imsic_cpu_page_phys(unsigned int cpu,
> > + unsigned int guest_index,
> > + phys_addr_t *out_msi_pa)
> > +{
> > + struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > + struct imsic_global_config *global;
> > + struct imsic_local_config *local;
> > +
> > + if (!handler || !handler->priv)
> > + return -ENODEV;
> > + local = &handler->local;
> > + global = &handler->priv->global;
> > +
> > + if (BIT(global->guest_index_bits) <= guest_index)
> > + return -EINVAL;
> > +
> > + if (out_msi_pa)
> > + *out_msi_pa = local->msi_pa +
> > + (guest_index * IMSIC_MMIO_PAGE_SZ);
> > +
> > + return 0;
> > +}
> > +
> > +static int imsic_get_cpu(struct imsic_priv *priv,
> > + const struct cpumask *mask_val, bool force,
> > + unsigned int *out_target_cpu)
> > +{
> > + struct cpumask amask;
> > + unsigned int cpu;
> > +
> > + cpumask_and(&amask, &priv->lmask, mask_val);
> > +
> > + if (force)
> > + cpu = cpumask_first(&amask);
> > + else
> > + cpu = cpumask_any_and(&amask, cpu_online_mask);
> > +
> > + if (cpu >= nr_cpu_ids)
> > + return -EINVAL;
> > +
> > + if (out_target_cpu)
> > + *out_target_cpu = cpu;
> > +
> > + return 0;
> > +}
> > +
> > +static int imsic_get_cpu_msi_msg(unsigned int cpu, unsigned int id,
> > + struct msi_msg *msg)
> > +{
> > + phys_addr_t msi_addr;
> > + int err;
> > +
> > + err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> > + if (err)
> > + return err;
> > +
> > + msg->address_hi = upper_32_bits(msi_addr);
> > + msg->address_lo = lower_32_bits(msi_addr);
> > + msg->data = id;
> > +
> > + return err;
> > +}
> > +
> > +static void imsic_id_set_target(struct imsic_priv *priv,
> > + unsigned int id, unsigned int target_cpu)
> > +{
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > + priv->ids_target_cpu[id] = target_cpu;
> > + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static unsigned int imsic_id_get_target(struct imsic_priv *priv,
> > + unsigned int id)
> > +{
> > + unsigned int ret;
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > + ret = priv->ids_target_cpu[id];
> > + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > + return ret;
> > +}
> > +
> > +static void __imsic_eix_update(unsigned long base_id,
> > + unsigned long num_id, bool pend, bool val)
> > +{
> > + unsigned long i, isel, ireg, flags;
> > + unsigned long id = base_id, last_id = base_id + num_id;
> > +
> > + while (id < last_id) {
> > + isel = id / BITS_PER_LONG;
> > + isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> > + isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> > +
> > + ireg = 0;
> > + for (i = id & (__riscv_xlen - 1);
> > + (id < last_id) && (i < __riscv_xlen); i++) {
> > + ireg |= BIT(i);
> > + id++;
> > + }
> > +
> > + /*
> > + * The IMSIC EIEx and EIPx registers are indirectly
> > + * accessed via using ISELECT and IREG CSRs so we
> > + * save/restore local IRQ to ensure that we don't
> > + * get preempted while accessing IMSIC registers.
> > + */
> > + local_irq_save(flags);
> > + if (val)
> > + imsic_csr_set(isel, ireg);
> > + else
> > + imsic_csr_clear(isel, ireg);
> > + local_irq_restore(flags);
>
> What is the actual requirement? no preemption? or no interrupts? This
> isn't the same thing. Also, a bunch of the users already disable
> interrupts. Consistency wouldn't hurt here.
No preemption is the only requirement. Since the callers of these
functions already disable local IRQs, I think we don't need to do
anything special here. I will drop the local IRQ save/restore and
update the comments as well.
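For reference, the register selection done by the loop above can be
modelled in user space as follows. This is only a sketch: the values of
IMSIC_EIP0, IMSIC_EIE0 and IMSIC_EIPx_BITS are assumed to match the CSR
defines from PATCH1, xlen stands in for BITS_PER_LONG/__riscv_xlen, and
the helper names are hypothetical.

```c
#include <stdbool.h>

/* Assumed values of the indirect register numbers and register width. */
#define IMSIC_EIP0      0x80
#define IMSIC_EIE0      0xc0
#define IMSIC_EIPx_BITS 32

/*
 * Hypothetical helper: ISELECT value of the EIPx/EIEx register that
 * holds interrupt identity 'id'. With xlen == 64 only even-numbered
 * registers are selected, each covering 64 identities.
 */
static unsigned long imsic_eix_isel(unsigned long id, unsigned int xlen,
				    bool pend)
{
	unsigned long isel = id / xlen;

	isel *= xlen / IMSIC_EIPx_BITS;
	return isel + (pend ? IMSIC_EIP0 : IMSIC_EIE0);
}

/* Hypothetical helper: bit position of 'id' inside that register. */
static unsigned int imsic_eix_bit(unsigned long id, unsigned int xlen)
{
	return id & (xlen - 1);
}
```

For example, identity 33 is bit 33 of EIE0 on a 64-bit HART but bit 1 of
EIE1 on a 32-bit HART.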
>
> > + }
> > +}
> > +
> > +#define __imsic_id_enable(__id) \
> > + __imsic_eix_update((__id), 1, false, true)
> > +#define __imsic_id_disable(__id) \
> > + __imsic_eix_update((__id), 1, false, false)
> > +
> > +#ifdef CONFIG_SMP
> > +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> > +{
> > + struct imsic_handler *handler;
> > + struct cpumask amask;
> > + int cpu;
> > +
> > + cpumask_and(&amask, &priv->lmask, cpu_online_mask);
>
> Can't this race against a CPU going down?
Yes, it can race if a CPU goes down while we are in this function,
but this won't be a problem because imsic_starting_cpu() will
unconditionally do imsic_ids_local_sync() when the CPU is brought
up again. I will add a multiline comment block explaining this.
>
> > + for_each_cpu(cpu, &amask) {
> > + if (cpu == smp_processor_id())
> > + continue;
> > +
> > + handler = per_cpu_ptr(&imsic_handlers, cpu);
> > + if (!handler || !handler->priv || !handler->local.msi_va) {
> > + pr_warn("CPU%d: handler not initialized\n", cpu);
>
> How many times are you going to do that? On each failing synchronisation?
My bad for adding these paranoid checks. I will remove these checks
wherever possible.
>
> > + continue;
> > + }
> > +
> > + writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
>
> As I understand it, this is a "behind the scenes" IPI. Why isn't that
> a *real* IPI?
Yes, that's correct. The ID enable bits are per-CPU and accessible
only via CSRs, hence we have a special "behind the scenes" IPI to
synchronize the state of the ID enable bits.
>
> > + }
> > +}
> > +#else
> > +#define __imsic_id_smp_sync(__priv)
> > +#endif
> > +
> > +static void imsic_id_enable(struct imsic_priv *priv, unsigned int id)
> > +{
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > + bitmap_set(priv->ids_enabled_bimap, id, 1);
> > + __imsic_id_enable(id);
> > + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > + __imsic_id_smp_sync(priv);
> > +}
> > +
> > +static void imsic_id_disable(struct imsic_priv *priv, unsigned int id)
> > +{
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > + bitmap_clear(priv->ids_enabled_bimap, id, 1);
> > + __imsic_id_disable(id);
> > + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > + __imsic_id_smp_sync(priv);
> > +}
> > +
> > +static void imsic_ids_local_sync(struct imsic_priv *priv)
> > +{
> > + int i;
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > + for (i = 1; i <= priv->global.nr_ids; i++) {
> > + if (priv->ipi_id == i || priv->ipi_lsync_id == i)
> > + continue;
> > +
> > + if (test_bit(i, priv->ids_enabled_bimap))
> > + __imsic_id_enable(i);
> > + else
> > + __imsic_id_disable(i);
> > + }
> > + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static void imsic_ids_local_delivery(struct imsic_priv *priv, bool enable)
> > +{
> > + if (enable) {
> > + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> > + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> > + } else {
> > + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> > + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> > + }
> > +}
> > +
> > +static int imsic_ids_alloc(struct imsic_priv *priv,
> > + unsigned int max_id, unsigned int order)
> > +{
> > + int ret;
> > + unsigned long flags;
> > +
> > + if ((priv->global.nr_ids < max_id) ||
> > + (max_id < BIT(order)))
> > + return -EINVAL;
>
> Why do we need this check? Shouldn't that be guaranteed by
> construction?
Yes, these are redundant checks. I will remove them.
>
> > +
> > + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > + ret = bitmap_find_free_region(priv->ids_used_bimap,
> > + max_id + 1, order);
> > + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +
> > + return ret;
> > +}
> > +
> > +static void imsic_ids_free(struct imsic_priv *priv, unsigned int base_id,
> > + unsigned int order)
> > +{
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&priv->ids_lock, flags);
> > + bitmap_release_region(priv->ids_used_bimap, base_id, order);
> > + raw_spin_unlock_irqrestore(&priv->ids_lock, flags);
> > +}
> > +
> > +static int __init imsic_ids_init(struct imsic_priv *priv)
> > +{
> > + int i;
> > + struct imsic_global_config *global = &priv->global;
> > +
> > + raw_spin_lock_init(&priv->ids_lock);
> > +
> > + /* Allocate used bitmap */
> > + priv->ids_used_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> > + sizeof(unsigned long), GFP_KERNEL);
>
> How about bitmap_alloc?
Okay, I will use bitmap_zalloc() here.
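As a sanity check on the sizing, here is a small user-space model of
what the open-coded kcalloc() and bitmap_zalloc() both boil down to.
It is only a sketch: model_bitmap_zalloc() and the bit helpers are
hypothetical stand-ins, not the kernel implementation.

```c
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in for bitmap_zalloc(nbits, GFP_KERNEL). */
static unsigned long *model_bitmap_zalloc(unsigned int nbits)
{
	/* Round up to whole longs and zero them, like kcalloc() above */
	return calloc(BITS_TO_LONGS(nbits), sizeof(unsigned long));
}

/* Hypothetical helper: set bit 'nr' in the bitmap. */
static void model_set_bit(unsigned long *map, unsigned int nr)
{
	map[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

/* Hypothetical helper: return 1 if bit 'nr' is set, else 0. */
static int model_test_bit(const unsigned long *map, unsigned int nr)
{
	return (map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}
```

With nr_ids = 255 the bitmaps need nr_ids + 1 = 256 bits, and reserving
ID#0 is a single bit set in the zeroed bitmap.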
>
> > + if (!priv->ids_used_bimap)
> > + return -ENOMEM;
> > +
> > + /* Allocate enabled bitmap */
> > + priv->ids_enabled_bimap = kcalloc(BITS_TO_LONGS(global->nr_ids + 1),
> > + sizeof(unsigned long), GFP_KERNEL);
> > + if (!priv->ids_enabled_bimap) {
> > + kfree(priv->ids_used_bimap);
> > + return -ENOMEM;
> > + }
> > +
> > + /* Allocate target CPU array */
> > + priv->ids_target_cpu = kcalloc(global->nr_ids + 1,
> > + sizeof(unsigned int), GFP_KERNEL);
> > + if (!priv->ids_target_cpu) {
> > + kfree(priv->ids_enabled_bimap);
> > + kfree(priv->ids_used_bimap);
> > + return -ENOMEM;
> > + }
> > + for (i = 0; i <= global->nr_ids; i++)
> > + priv->ids_target_cpu[i] = UINT_MAX;
> > +
> > + /* Reserve ID#0 because it is special and never implemented */
> > + bitmap_set(priv->ids_used_bimap, 0, 1);
> > +
> > + return 0;
> > +}
> > +
> > +static void __init imsic_ids_cleanup(struct imsic_priv *priv)
> > +{
> > + kfree(priv->ids_target_cpu);
> > + kfree(priv->ids_enabled_bimap);
> > + kfree(priv->ids_used_bimap);
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static void imsic_ipi_send(unsigned int cpu)
> > +{
> > + struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > +
> > + if (!handler || !handler->priv || !handler->local.msi_va) {
> > + pr_warn("CPU%d: handler not initialized\n", cpu);
> > + return;
> > + }
> > +
> > + writel(handler->priv->ipi_id, handler->local.msi_va);
> > +}
> > +
> > +static void imsic_ipi_enable(struct imsic_priv *priv)
> > +{
> > + __imsic_id_enable(priv->ipi_id);
> > + __imsic_id_enable(priv->ipi_lsync_id);
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> > +{
> > + int virq;
> > +
> > + /* Allocate interrupt identity for IPIs */
> > + virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> > + if (virq < 0)
> > + return virq;
> > + priv->ipi_id = virq;
> > +
> > + /* Create IMSIC IPI multiplexing */
> > + virq = ipi_mux_create(BITS_PER_BYTE, imsic_ipi_send);
>
> Please! This BITS_PER_BYTE makes zero sense here. Have a proper define
> that says 8, and document *why* this is 8! You're not defining a type
> system, you're writing a irqchip driver.
Okay, I will add a "#define" for the number of IPIs with an explanation
for *why*.
>
> > + if (virq <= 0) {
> > + imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > + return (virq < 0) ? virq : -ENOMEM;
> > + }
> > +
> > + /* Set vIRQ range */
> > + riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, true);
> > +
> > + /* Allocate interrupt identity for local enable/disable sync */
> > + virq = imsic_ids_alloc(priv, priv->global.nr_ids, get_count_order(1));
> > + if (virq < 0) {
> > + imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > + return virq;
> > + }
> > + priv->ipi_lsync_id = virq;
> > +
> > + return 0;
> > +}
> > +
> > +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> > +{
> > + imsic_ids_free(priv, priv->ipi_lsync_id, get_count_order(1));
> > + if (priv->ipi_id)
> > + imsic_ids_free(priv, priv->ipi_id, get_count_order(1));
> > +}
> > +#else
> > +static void imsic_ipi_enable(struct imsic_priv *priv)
> > +{
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(struct imsic_priv *priv)
> > +{
> > + /* Clear the IPI ids because we are not using IPIs */
> > + priv->ipi_id = 0;
> > + priv->ipi_lsync_id = 0;
> > + return 0;
> > +}
> > +
> > +static void __init imsic_ipi_domain_cleanup(struct imsic_priv *priv)
> > +{
> > +}
> > +#endif
> > +
> > +static void imsic_irq_mask(struct irq_data *d)
> > +{
> > + imsic_id_disable(irq_data_get_irq_chip_data(d), d->hwirq);
> > +}
> > +
> > +static void imsic_irq_unmask(struct irq_data *d)
> > +{
> > + imsic_id_enable(irq_data_get_irq_chip_data(d), d->hwirq);
> > +}
> > +
> > +static void imsic_irq_compose_msi_msg(struct irq_data *d,
> > + struct msi_msg *msg)
> > +{
> > + struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> > + unsigned int cpu;
> > + int err;
> > +
> > + cpu = imsic_id_get_target(priv, d->hwirq);
> > + WARN_ON(cpu == UINT_MAX);
> > +
> > + err = imsic_get_cpu_msi_msg(cpu, d->hwirq, msg);
> > + WARN_ON(err);
> > +
> > + iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static int imsic_irq_set_affinity(struct irq_data *d,
> > + const struct cpumask *mask_val,
> > + bool force)
> > +{
> > + struct imsic_priv *priv = irq_data_get_irq_chip_data(d);
> > + unsigned int target_cpu;
> > + int rc;
> > +
> > + rc = imsic_get_cpu(priv, mask_val, force, &target_cpu);
> > + if (rc)
> > + return rc;
> > +
> > + imsic_id_set_target(priv, d->hwirq, target_cpu);
> > + irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
> > +
> > + return IRQ_SET_MASK_OK;
> > +}
> > +#endif
> > +
> > +static struct irq_chip imsic_irq_base_chip = {
> > + .name = "RISC-V IMSIC-BASE",
> > + .irq_mask = imsic_irq_mask,
> > + .irq_unmask = imsic_irq_unmask,
> > +#ifdef CONFIG_SMP
> > + .irq_set_affinity = imsic_irq_set_affinity,
> > +#endif
> > + .irq_compose_msi_msg = imsic_irq_compose_msi_msg,
> > + .flags = IRQCHIP_SKIP_SET_WAKE |
> > + IRQCHIP_MASK_ON_SUSPEND,
> > +};
> > +
> > +static int imsic_irq_domain_alloc(struct irq_domain *domain,
> > + unsigned int virq,
> > + unsigned int nr_irqs,
> > + void *args)
> > +{
> > + struct imsic_priv *priv = domain->host_data;
> > + msi_alloc_info_t *info = args;
> > + phys_addr_t msi_addr;
> > + int i, hwirq, err = 0;
> > + unsigned int cpu;
> > +
> > + err = imsic_get_cpu(priv, &priv->lmask, false, &cpu);
> > + if (err)
> > + return err;
> > +
> > + err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
> > + if (err)
> > + return err;
> > +
> > + hwirq = imsic_ids_alloc(priv, priv->global.nr_ids,
> > + get_count_order(nr_irqs));
> > + if (hwirq < 0)
> > + return hwirq;
> > +
> > + err = iommu_dma_prepare_msi(info->desc, msi_addr);
> > + if (err)
> > + goto fail;
> > +
> > + for (i = 0; i < nr_irqs; i++) {
> > + imsic_id_set_target(priv, hwirq + i, cpu);
> > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > + &imsic_irq_base_chip, priv,
> > + handle_simple_irq, NULL, NULL);
> > + irq_set_noprobe(virq + i);
> > + irq_set_affinity(virq + i, &priv->lmask);
> > + }
> > +
> > + return 0;
> > +
> > +fail:
> > + imsic_ids_free(priv, hwirq, get_count_order(nr_irqs));
> > + return err;
> > +}
> > +
> > +static void imsic_irq_domain_free(struct irq_domain *domain,
> > + unsigned int virq,
> > + unsigned int nr_irqs)
> > +{
> > + struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> > + struct imsic_priv *priv = domain->host_data;
> > +
> > + imsic_ids_free(priv, d->hwirq, get_count_order(nr_irqs));
> > + irq_domain_free_irqs_parent(domain, virq, nr_irqs);
> > +}
> > +
> > +static const struct irq_domain_ops imsic_base_domain_ops = {
> > + .alloc = imsic_irq_domain_alloc,
> > + .free = imsic_irq_domain_free,
> > +};
> > +
> > +#ifdef CONFIG_RISCV_IMSIC_PCI
> > +
> > +static void imsic_pci_mask_irq(struct irq_data *d)
> > +{
> > + pci_msi_mask_irq(d);
> > + irq_chip_mask_parent(d);
> > +}
> > +
> > +static void imsic_pci_unmask_irq(struct irq_data *d)
> > +{
> > + pci_msi_unmask_irq(d);
> > + irq_chip_unmask_parent(d);
> > +}
> > +
> > +static struct irq_chip imsic_pci_irq_chip = {
> > + .name = "RISC-V IMSIC-PCI",
> > + .irq_mask = imsic_pci_mask_irq,
> > + .irq_unmask = imsic_pci_unmask_irq,
> > + .irq_eoi = irq_chip_eoi_parent,
> > +};
> > +
> > +static struct msi_domain_ops imsic_pci_domain_ops = {
> > +};
> > +
> > +static struct msi_domain_info imsic_pci_domain_info = {
> > + .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> > + MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
> > + .ops = &imsic_pci_domain_ops,
> > + .chip = &imsic_pci_irq_chip,
> > +};
> > +
> > +#endif
> > +
> > +static struct irq_chip imsic_plat_irq_chip = {
> > + .name = "RISC-V IMSIC-PLAT",
> > +};
> > +
> > +static struct msi_domain_ops imsic_plat_domain_ops = {
> > +};
> > +
> > +static struct msi_domain_info imsic_plat_domain_info = {
> > + .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
> > + .ops = &imsic_plat_domain_ops,
> > + .chip = &imsic_plat_irq_chip,
> > +};
> > +
> > +static int __init imsic_irq_domains_init(struct imsic_priv *priv,
> > + struct fwnode_handle *fwnode)
> > +{
> > + /* Create Base IRQ domain */
> > + priv->base_domain = irq_domain_create_tree(fwnode,
> > + &imsic_base_domain_ops, priv);
> > + if (!priv->base_domain) {
> > + pr_err("Failed to create IMSIC base domain\n");
> > + return -ENOMEM;
> > + }
> > + irq_domain_update_bus_token(priv->base_domain, DOMAIN_BUS_NEXUS);
> > +
> > +#ifdef CONFIG_RISCV_IMSIC_PCI
> > + /* Create PCI MSI domain */
> > + priv->pci_domain = pci_msi_create_irq_domain(fwnode,
> > + &imsic_pci_domain_info,
> > + priv->base_domain);
> > + if (!priv->pci_domain) {
> > + pr_err("Failed to create IMSIC PCI domain\n");
> > + irq_domain_remove(priv->base_domain);
> > + return -ENOMEM;
> > + }
> > +#endif
> > +
> > + /* Create Platform MSI domain */
> > + priv->plat_domain = platform_msi_create_irq_domain(fwnode,
> > + &imsic_plat_domain_info,
> > + priv->base_domain);
> > + if (!priv->plat_domain) {
> > + pr_err("Failed to create IMSIC platform domain\n");
> > + if (priv->pci_domain)
> > + irq_domain_remove(priv->pci_domain);
> > + irq_domain_remove(priv->base_domain);
> > + return -ENOMEM;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> > + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> > + * Linux interrupt number and let Linux IRQ subsystem handle it.
> > + */
> > +static void imsic_handle_irq(struct irq_desc *desc)
> > +{
> > + struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > + struct irq_chip *chip = irq_desc_get_chip(desc);
> > + struct imsic_priv *priv = handler->priv;
> > + irq_hw_number_t hwirq;
> > + int err;
> > +
> > + WARN_ON_ONCE(!handler->priv);
> > +
> > + chained_irq_enter(chip, desc);
> > +
> > + while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
> > + hwirq = hwirq >> TOPEI_ID_SHIFT;
> > +
> > + if (hwirq == priv->ipi_id) {
> > +#ifdef CONFIG_SMP
> > + ipi_mux_process();
> > +#endif
> > + continue;
> > + } else if (hwirq == priv->ipi_lsync_id) {
> > + imsic_ids_local_sync(priv);
> > + continue;
> > + }
> > +
> > + err = generic_handle_domain_irq(priv->base_domain, hwirq);
> > + if (unlikely(err))
> > + pr_warn_ratelimited(
> > + "hwirq %lu mapping not found\n", hwirq);
> > + }
> > +
> > + chained_irq_exit(chip, desc);
> > +}
> > +
> > +static int imsic_starting_cpu(unsigned int cpu)
> > +{
> > + struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > + struct imsic_priv *priv = handler->priv;
> > +
> > + /* Enable per-CPU parent interrupt */
> > + if (imsic_parent_irq)
> > + enable_percpu_irq(imsic_parent_irq,
> > + irq_get_trigger_type(imsic_parent_irq));
>
> Shouldn't that be the default already?
The imsic_parent_irq is already set before calling imsic_starting_cpu()
on each CPU so we can drop the if-check.
>
> > + else
> > + pr_warn("cpu%d: parent irq not available\n", cpu);
>
> And yet continue in sequence? Duh...
This warning is also not required.
>
> > +
> > + /* Enable IPIs */
> > + imsic_ipi_enable(priv);
> > +
> > + /*
> > + * Interrupts identities might have been enabled/disabled while
> > + * this CPU was not running so sync-up local enable/disable state.
> > + */
> > + imsic_ids_local_sync(priv);
> > +
> > + /* Locally enable interrupt delivery */
> > + imsic_ids_local_delivery(priv, true);
> > +
> > + return 0;
> > +}
> > +
> > +struct imsic_fwnode_ops {
> > + u32 (*nr_parent_irq)(struct fwnode_handle *fwnode,
> > + void *fwopaque);
> > + int (*parent_hartid)(struct fwnode_handle *fwnode,
> > + void *fwopaque, u32 index,
> > + unsigned long *out_hartid);
> > + u32 (*nr_mmio)(struct fwnode_handle *fwnode, void *fwopaque);
> > + int (*mmio_to_resource)(struct fwnode_handle *fwnode,
> > + void *fwopaque, u32 index,
> > + struct resource *res);
> > + void __iomem *(*mmio_map)(struct fwnode_handle *fwnode,
> > + void *fwopaque, u32 index);
> > + int (*read_u32)(struct fwnode_handle *fwnode,
> > + void *fwopaque, const char *prop, u32 *out_val);
> > + bool (*read_bool)(struct fwnode_handle *fwnode,
> > + void *fwopaque, const char *prop);
> > +};
>
> Why do we need this sort of (terrible) indirection?
Okay, I will replace this indirection with fwnode APIs.
>
> > +
> > +static int __init imsic_init(struct imsic_fwnode_ops *fwops,
> > + struct fwnode_handle *fwnode,
> > + void *fwopaque)
> > +{
> > + struct resource res;
> > + phys_addr_t base_addr;
> > + int rc, nr_parent_irqs;
> > + struct imsic_mmio *mmio;
> > + struct imsic_priv *priv;
> > + struct irq_domain *domain;
> > + struct imsic_handler *handler;
> > + struct imsic_global_config *global;
> > + u32 i, tmp, nr_handlers = 0;
> > +
> > + if (imsic_init_done) {
> > + pr_err("%pfwP: already initialized hence ignoring\n",
> > + fwnode);
> > + return -ENODEV;
> > + }
> > +
> > + if (!riscv_isa_extension_available(NULL, SxAIA)) {
> > + pr_err("%pfwP: AIA support not available\n", fwnode);
> > + return -ENODEV;
> > + }
> > +
> > + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> > + if (!priv)
> > + return -ENOMEM;
> > + global = &priv->global;
> > +
> > + /* Find number of parent interrupts */
> > + nr_parent_irqs = fwops->nr_parent_irq(fwnode, fwopaque);
> > + if (!nr_parent_irqs) {
> > + pr_err("%pfwP: no parent irqs available\n", fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Find number of guest index bits in MSI address */
> > + rc = fwops->read_u32(fwnode, fwopaque, "riscv,guest-index-bits",
> > + &global->guest_index_bits);
> > + if (rc)
> > + global->guest_index_bits = 0;
> > + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> > + if (tmp < global->guest_index_bits) {
> > + pr_err("%pfwP: guest index bits too big\n", fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Find number of HART index bits */
> > + rc = fwops->read_u32(fwnode, fwopaque, "riscv,hart-index-bits",
> > + &global->hart_index_bits);
> > + if (rc) {
> > + /* Assume default value */
> > + global->hart_index_bits = __fls(nr_parent_irqs);
> > + if (BIT(global->hart_index_bits) < nr_parent_irqs)
> > + global->hart_index_bits++;
> > + }
> > + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> > + global->guest_index_bits;
> > + if (tmp < global->hart_index_bits) {
> > + pr_err("%pfwP: HART index bits too big\n", fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Find number of group index bits */
> > + rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-bits",
> > + &global->group_index_bits);
> > + if (rc)
> > + global->group_index_bits = 0;
> > + tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> > + global->guest_index_bits - global->hart_index_bits;
> > + if (tmp < global->group_index_bits) {
> > + pr_err("%pfwP: group index bits too big\n", fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /*
> > + * Find first bit position of group index.
> > + * If not specified assumed the default APLIC-IMSIC configuration.
> > + */
> > + rc = fwops->read_u32(fwnode, fwopaque, "riscv,group-index-shift",
> > + &global->group_index_shift);
> > + if (rc)
> > + global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> > + tmp = global->group_index_bits + global->group_index_shift - 1;
> > + if (tmp >= BITS_PER_LONG) {
> > + pr_err("%pfwP: group index shift too big\n", fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Find number of interrupt identities */
> > + rc = fwops->read_u32(fwnode, fwopaque, "riscv,num-ids",
> > + &global->nr_ids);
> > + if (rc) {
> > + pr_err("%pfwP: number of interrupt identities not found\n",
> > + fwnode);
> > + return rc;
> > + }
> > + if ((global->nr_ids < IMSIC_MIN_ID) ||
> > + (global->nr_ids >= IMSIC_MAX_ID) ||
> > + ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> > + pr_err("%pfwP: invalid number of interrupt identities\n",
> > + fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Find number of guest interrupt identities */
> > + if (fwops->read_u32(fwnode, fwopaque, "riscv,num-guest-ids",
> > + &global->nr_guest_ids))
> > + global->nr_guest_ids = global->nr_ids;
> > + if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> > + (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> > + ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> > + pr_err("%pfwP: invalid number of guest interrupt identities\n",
> > + fwnode);
> > + return -EINVAL;
> > + }
>
> Please split the whole guest stuff out. It is totally unused!
The number of guest IDs is used by the KVM RISC-V AIA support which
is in the pipeline. KVM RISC-V only needs imsic_get_global_config()
and imsic_get_local_config(). The "nr_guest_ids" is part of the
IMSIC global config.
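The same range check is applied to both "riscv,num-ids" and
"riscv,num-guest-ids". A user-space sketch of that check, assuming
IMSIC_MIN_ID is 63 and IMSIC_MAX_ID is 2048 as in the header added by
this patch (imsic_nr_ids_valid() is a hypothetical name):

```c
#include <stdbool.h>

/* Assumed values from include/linux/irqchip/riscv-imsic.h */
#define IMSIC_MIN_ID 63
#define IMSIC_MAX_ID 2048

/*
 * Hypothetical helper: accept only 63, 127, 191, ..., 2047, i.e.
 * counts in range whose low six bits are all set.
 */
static bool imsic_nr_ids_valid(unsigned int nr_ids)
{
	return nr_ids >= IMSIC_MIN_ID &&
	       nr_ids < IMSIC_MAX_ID &&
	       (nr_ids & IMSIC_MIN_ID) == IMSIC_MIN_ID;
}
```

So 63, 127 and 2047 pass, while 64, 100 and 2048 are rejected.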
>
> I've stopped reading. This needs structure, cleanups and a bit of
> taste. Not a lot of that here at the moment.
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
It took a while to address all your comments and I also got
preempted by other stuff. Sorry for the delay.
Regards,
Anup
On Mon, 01 May 2023 09:28:16 +0100,
Anup Patel <[email protected]> wrote:
>
> On Fri, Jan 13, 2023 at 3:40 PM Marc Zyngier <[email protected]> wrote:
> >
> > On Tue, 03 Jan 2023 14:14:05 +0000,
> > Anup Patel <[email protected]> wrote:
> > >
> > > The RISC-V advanced interrupt architecture (AIA) specification defines
> > > a new MSI controller for managing MSIs on a RISC-V platform. This new
> > > MSI controller is referred to as incoming message signaled interrupt
> > > controller (IMSIC) which manages MSI on per-HART (or per-CPU) basis.
> > > (For more details refer https://github.com/riscv/riscv-aia)
> >
> > And how about IPIs, which this driver seems to be concerned about?
>
> Okay, I will mention about IPIs in the commit description.
>
> >
> > >
> > > This patch adds an irqchip driver for RISC-V IMSIC found on RISC-V
> > > platforms.
> > >
> > > Signed-off-by: Anup Patel <[email protected]>
> > > ---
> > > drivers/irqchip/Kconfig | 14 +-
> > > drivers/irqchip/Makefile | 1 +
> > > drivers/irqchip/irq-riscv-imsic.c | 1174 +++++++++++++++++++++++++++
> > > include/linux/irqchip/riscv-imsic.h | 92 +++
> > > 4 files changed, 1280 insertions(+), 1 deletion(-)
> > > create mode 100644 drivers/irqchip/irq-riscv-imsic.c
> > > create mode 100644 include/linux/irqchip/riscv-imsic.h
> > >
> > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > index 9e65345ca3f6..a1315189a595 100644
> > > --- a/drivers/irqchip/Kconfig
> > > +++ b/drivers/irqchip/Kconfig
> > > @@ -29,7 +29,6 @@ config ARM_GIC_V2M
> > >
> > > config GIC_NON_BANKED
> > > bool
> > > -
> > > config ARM_GIC_V3
> > > bool
> > > select IRQ_DOMAIN_HIERARCHY
> > > @@ -548,6 +547,19 @@ config SIFIVE_PLIC
> > > select IRQ_DOMAIN_HIERARCHY
> > > select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> > >
> > > +config RISCV_IMSIC
> > > + bool
> > > + depends on RISCV
> > > + select IRQ_DOMAIN_HIERARCHY
> > > + select GENERIC_MSI_IRQ_DOMAIN
> > > +
> > > +config RISCV_IMSIC_PCI
> > > + bool
> > > + depends on RISCV_IMSIC
> > > + depends on PCI
> > > + depends on PCI_MSI
> > > + default RISCV_IMSIC
> >
> > This should definitely tell you that this driver needs splitting.
>
> The code under "#ifdef CONFIG_RISCV_IMSIC_PCI" is hardly 40 lines
> so I felt it was too small to deserve its own source file.
It at least needs its own patch.
>
> >
> > > +
> > > config EXYNOS_IRQ_COMBINER
> > > bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> > > depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > > index 87b49a10962c..22c723cc6ec8 100644
> > > --- a/drivers/irqchip/Makefile
> > > +++ b/drivers/irqchip/Makefile
> > > @@ -96,6 +96,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> > > obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> > > obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> > > obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> > > +obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
> > > obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> > > obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> > > obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
> > > diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
> > > new file mode 100644
> > > index 000000000000..4c16b66738d6
> > > --- /dev/null
> > > +++ b/drivers/irqchip/irq-riscv-imsic.c
> > > @@ -0,0 +1,1174 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > > + */
> > > +
> > > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > > +#include <linux/bitmap.h>
> > > +#include <linux/cpu.h>
> > > +#include <linux/interrupt.h>
> > > +#include <linux/io.h>
> > > +#include <linux/iommu.h>
> > > +#include <linux/irq.h>
> > > +#include <linux/irqchip.h>
> > > +#include <linux/irqchip/chained_irq.h>
> > > +#include <linux/irqchip/riscv-imsic.h>
> > > +#include <linux/irqdomain.h>
> > > +#include <linux/module.h>
> > > +#include <linux/msi.h>
> > > +#include <linux/of.h>
> > > +#include <linux/of_address.h>
> > > +#include <linux/of_irq.h>
> > > +#include <linux/pci.h>
> > > +#include <linux/platform_device.h>
> > > +#include <linux/spinlock.h>
> > > +#include <linux/smp.h>
> > > +#include <asm/hwcap.h>
> > > +
> > > +#define IMSIC_DISABLE_EIDELIVERY 0
> > > +#define IMSIC_ENABLE_EIDELIVERY 1
> > > +#define IMSIC_DISABLE_EITHRESHOLD 1
> > > +#define IMSIC_ENABLE_EITHRESHOLD 0
> > > +
> > > +#define imsic_csr_write(__c, __v) \
> > > +do { \
> > > + csr_write(CSR_ISELECT, __c); \
> > > + csr_write(CSR_IREG, __v); \
> > > +} while (0)
> > > +
> > > +#define imsic_csr_read(__c) \
> > > +({ \
> > > + unsigned long __v; \
> > > + csr_write(CSR_ISELECT, __c); \
> > > + __v = csr_read(CSR_IREG); \
> > > + __v; \
> > > +})
> > > +
> > > +#define imsic_csr_set(__c, __v) \
> > > +do { \
> > > + csr_write(CSR_ISELECT, __c); \
> > > + csr_set(CSR_IREG, __v); \
> > > +} while (0)
> > > +
> > > +#define imsic_csr_clear(__c, __v) \
> > > +do { \
> > > + csr_write(CSR_ISELECT, __c); \
> > > + csr_clear(CSR_IREG, __v); \
> > > +} while (0)
> > > +
> > > +struct imsic_mmio {
> > > + phys_addr_t pa;
> > > + void __iomem *va;
> > > + unsigned long size;
> > > +};
> > > +
> > > +struct imsic_priv {
> > > + /* Global configuration common for all HARTs */
> > > + struct imsic_global_config global;
> > > +
> > > + /* MMIO regions */
> > > + u32 num_mmios;
> > > + struct imsic_mmio *mmios;
> > > +
> > > + /* Global state of interrupt identities */
> > > + raw_spinlock_t ids_lock;
> > > + unsigned long *ids_used_bimap;
> > > + unsigned long *ids_enabled_bimap;
> > > + unsigned int *ids_target_cpu;
> > > +
> > > + /* Mask for connected CPUs */
> > > + struct cpumask lmask;
> > > +
> > > + /* IPI interrupt identity */
> > > + u32 ipi_id;
> > > + u32 ipi_lsync_id;
> > > +
> > > + /* IRQ domains */
> > > + struct irq_domain *base_domain;
> > > + struct irq_domain *pci_domain;
> > > + struct irq_domain *plat_domain;
> > > +};
> > > +
> > > +struct imsic_handler {
> > > + /* Local configuration for given HART */
> > > + struct imsic_local_config local;
> > > +
> > > + /* Pointer to private context */
> > > + struct imsic_priv *priv;
> > > +};
> > > +
> > > +static bool imsic_init_done;
> > > +
> > > +static int imsic_parent_irq;
> > > +static DEFINE_PER_CPU(struct imsic_handler, imsic_handlers);
> > > +
> > > +const struct imsic_global_config *imsic_get_global_config(void)
> > > +{
> > > + struct imsic_handler *handler = this_cpu_ptr(&imsic_handlers);
> > > +
> > > + if (!handler || !handler->priv)
> > > + return NULL;
> > > +
> > > + return &handler->priv->global;
> > > +}
> > > +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> > > +
> > > +const struct imsic_local_config *imsic_get_local_config(unsigned int cpu)
> > > +{
> > > + struct imsic_handler *handler = per_cpu_ptr(&imsic_handlers, cpu);
> > > +
> > > + if (!handler || !handler->priv)
> > > + return NULL;
> >
> > How can this happen?
>
> These are redundant checks. I will drop.
>
> >
> > > +
> > > + return &handler->local;
> > > +}
> > > +EXPORT_SYMBOL_GPL(imsic_get_local_config);
> >
> > Why are these symbols exported? They have no user, so they shouldn't
> > even exist here. I also seriously doubt there is a valid use case for
> > exposing this information to the rest of the kernel.
>
> The imsic_get_global_config() is used by APLIC driver and KVM RISC-V
> module whereas imsic_get_local_config() is only used by KVM RISC-V.
>
> The KVM RISC-V AIA irqchip patches are available in the riscv_kvm_aia_v1
> branch at: https://github.com/avpatel/linux.git. I have not posted the
> KVM RISC-V patches due to various interdependencies.
Then the symbols can wait, can't they? It'd make more sense if the
KVM-dependent bits were brought together with the KVM patches.
Even better, you'd use some level of abstraction between KVM and the
irqchip code. GIC makes some heavy use of irq_set_vcpu_affinity() as a
private API with KVM, and I'd suggest you look into something similar.
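
As a rough illustration of the abstraction suggested above: in the kernel, irq_set_vcpu_affinity() takes an IRQ number and an opaque vcpu_info pointer and hands that pointer to the irq_chip's irq_set_vcpu_affinity callback, so the hypervisor never needs the irqchip's internal configuration. The sketch below is a user-space model of that shape only, not kernel code. Every model_* name, the payload layout, and the convention that NULL stops forwarding are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical payload: only the hypervisor side and the chip driver know this layout. */
struct model_vcpu_info {
	unsigned int guest_index;	/* hypothetical: guest interrupt file to target */
	unsigned long msi_pa;		/* hypothetical: physical address of that file */
};

struct model_irq_data;

/* Stand-in for the per-chip callback table. */
struct model_irq_chip {
	int (*set_vcpu_affinity)(struct model_irq_data *d, void *vcpu_info);
};

struct model_irq_data {
	const struct model_irq_chip *chip;
	int forwarded;			/* 1 while the interrupt is routed to a guest */
	struct model_vcpu_info info;	/* last routing request accepted by the chip */
};

/* Chip-side handler. Assumption of this model: NULL means "stop forwarding". */
static int model_imsic_set_vcpu_affinity(struct model_irq_data *d, void *vcpu_info)
{
	if (!vcpu_info) {
		d->forwarded = 0;
		return 0;
	}
	d->info = *(const struct model_vcpu_info *)vcpu_info;
	d->forwarded = 1;
	return 0;
}

static const struct model_irq_chip model_imsic_chip = {
	.set_vcpu_affinity = model_imsic_set_vcpu_affinity,
};

/* Generic entry point: passes the opaque pointer through without interpreting it. */
static int model_irq_set_vcpu_affinity(struct model_irq_data *d, void *vcpu_info)
{
	assert(d != NULL);
	if (!d->chip || !d->chip->set_vcpu_affinity)
		return -ENOSYS;
	return d->chip->set_vcpu_affinity(d, vcpu_info);
}
```

With this shape the caller only ever sees model_irq_set_vcpu_affinity(); nothing like imsic_get_local_config() has to be exported for it to work.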
[...]
> > > +#ifdef CONFIG_SMP
> > > +static void __imsic_id_smp_sync(struct imsic_priv *priv)
> > > +{
> > > + struct imsic_handler *handler;
> > > + struct cpumask amask;
> > > + int cpu;
> > > +
> > > + cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> >
> > Can't this race against a CPU going down?
>
> Yes, it can race if a CPU goes down while we are in this function
> but this won't be a problem because the imsic_starting_cpu()
> will unconditionally do imsic_ids_local_sync() when the CPU is
> brought up again. I will add a multiline comment block explaining
> this.
I'd rather you avoid the race instead of papering over it.
>
> >
> > > + for_each_cpu(cpu, &amask) {
> > > + if (cpu == smp_processor_id())
> > > + continue;
> > > +
> > > + handler = per_cpu_ptr(&imsic_handlers, cpu);
> > > + if (!handler || !handler->priv || !handler->local.msi_va) {
> > > + pr_warn("CPU%d: handler not initialized\n", cpu);
> >
> > How many times are you going to do that? On each failing synchronisation?
>
> My bad for adding these paranoid checks. I will remove these checks
> wherever possible.
>
> >
> > > + continue;
> > > + }
> > > +
> > > + writel(handler->priv->ipi_lsync_id, handler->local.msi_va);
> >
> > As I understand it, this is a "behind the scenes" IPI. Why isn't that
> > a *real* IPI?
>
> Yes, that's correct. The ID enable bits are per-CPU accessible only
> via CSRs hence we have a special "behind the scenes" IPI to
> synchronize state of ID enable bits.
My question still stands: why isn't this a *real*, Linux visible IPI?
This sideband signalling makes everything hard to follow, hard to
debug, and screws up accounting.
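
For readers following the thread, here is a small user-space model of the scheme under discussion: the ID enable bits live per CPU and can only be changed by that CPU, so a change made on one CPU has to be replayed on the others, and a CPU that was offline catches up when it starts. It is only a sketch. The model_* names, the 64-ID limit, and the sync_pending flag (standing in for the MSI write of ipi_lsync_id) are assumptions for illustration and do not mirror the driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4

struct model_cpu {
	bool online;
	bool sync_pending;	/* stands in for the sideband sync ID being pending */
	uint64_t ids_enabled;	/* per-CPU enable bits, changed only by this CPU */
};

struct model_imsic {
	uint64_t ids_enabled;	/* desired state shared by all CPUs */
	struct model_cpu cpu[MODEL_NR_CPUS];
};

/*
 * Runs on CPU 'self': update the shared state, apply it locally, and
 * poke every other online CPU. Returns how many CPUs were poked.
 */
static unsigned int model_id_set_enable(struct model_imsic *m, unsigned int self,
					unsigned int id, bool enable)
{
	unsigned int cpu, poked = 0;

	assert(self < MODEL_NR_CPUS && id < 64);
	if (enable)
		m->ids_enabled |= (uint64_t)1 << id;
	else
		m->ids_enabled &= ~((uint64_t)1 << id);
	m->cpu[self].ids_enabled = m->ids_enabled;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu == self || !m->cpu[cpu].online)
			continue;
		m->cpu[cpu].sync_pending = true;
		poked++;
	}
	return poked;
}

/* Runs on 'cpu' when it takes the sync interrupt: copy the shared state. */
static void model_handle_sync(struct model_imsic *m, unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	if (!m->cpu[cpu].sync_pending)
		return;
	m->cpu[cpu].sync_pending = false;
	m->cpu[cpu].ids_enabled = m->ids_enabled;
}

/* Runs on 'cpu' when it comes online: always copy the shared state. */
static void model_cpu_starting(struct model_imsic *m, unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	m->cpu[cpu].online = true;
	m->cpu[cpu].ids_enabled = m->ids_enabled;
}
```

Note that nothing in this model is visible to generic IPI bookkeeping, which is the concern raised above; a Linux-visible IPI would go through the normal IPI path instead of a private flag.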
> > Please split the whole guest stuff out. It is totally unused!
>
> The number of guest IDs is used by KVM RISC-V AIA support which
> is in the pipeline. KVM RISC-V only needs imsic_get_global_config()
> and imsic_get_local_config(). The "nr_guest_ids" is part of the
> IMSIC global config.
And yet it isn't needed for a minimal driver, which is what I'd like to
see at first. Shoving the kitchen sink into an initial patch isn't a
great way to get it merged.
M.
--
Without deviation from the norm, progress is not possible.
Hello all,
I have a question regarding the handling of potential issues during the MSI interrupt sending process. It appears that if the APLIC target register's value is modified during the MSI interrupt sending process, it could potentially lead to MSI interrupt send failures. The code doesn't seem to account for this scenario or take appropriate measures.
I am reaching out to seek clarification on whether this situation has been considered and if there are specific reasons for not addressing it in the code. Your insights into this matter would be highly appreciated.
Thank you for your time, and I look forward to your response.
Best regards
On Fri, Nov 24, 2023 at 8:33 AM 谢 波 <[email protected]> wrote:
>
> Hello all,
>
>
> I have a question regarding the handling of potential issues during the MSI interrupt sending process. It appears that if the APLIC target register's value is modified during the MSI interrupt sending process, it could potentially lead to MSI interrupt send failures. The code doesn't seem to account for this scenario or take appropriate measures.
>
> I am reaching out to seek clarification on whether this situation has been considered and if there are specific reasons for not addressing it in the code. Your insights into this matter would be highly appreciated.
>
> Thank you for your time, and I look forward to your response.
This is handled by the irq_set_affinity() callback in the IMSIC driver:
the IMSIC driver re-writes the MSI message whenever the IRQ affinity
changes.
Please look at PATCH7 and PATCH8 of the "[PATCH v11 00/14] Linux
RISC-V AIA Support" series.
(Refer, https://www.spinics.net/lists/devicetree/msg643764.html)
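
The idea of re-writing the MSI message on an affinity change can be sketched in user space as follows. This is a simplified model, not the code from the series: the model_* names, the base address, and the per-CPU stride are hypothetical, and the sketch does not model how interrupts already in flight during the change are handled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_NR_CPUS	4
#define MODEL_MSI_BASE	0x28000000ULL	/* hypothetical base of per-CPU MSI pages */
#define MODEL_MSI_STRIDE 0x1000ULL	/* hypothetical distance between pages */

/* Same three fields as the kernel's struct msi_msg. */
struct model_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

struct model_irq {
	uint32_t id;			/* interrupt identity, sent as MSI data */
	unsigned int cpu;		/* CPU currently targeted */
	struct model_msi_msg msg;	/* what the MSI source is programmed with */
};

/* Physical address of the MSI page that belongs to 'cpu' in this model. */
static uint64_t model_msi_pa(unsigned int cpu)
{
	return MODEL_MSI_BASE + (uint64_t)cpu * MODEL_MSI_STRIDE;
}

/* Build the message for the IRQ's current target CPU. */
static void model_compose_msg(const struct model_irq *irq, struct model_msi_msg *msg)
{
	uint64_t pa = model_msi_pa(irq->cpu);

	msg->address_lo = (uint32_t)pa;
	msg->address_hi = (uint32_t)(pa >> 32);
	msg->data = irq->id;
}

/* Move the IRQ to 'new_cpu' and re-write the programmed message to match. */
static int model_set_affinity(struct model_irq *irq, unsigned int new_cpu)
{
	if (new_cpu >= MODEL_NR_CPUS)
		return -EINVAL;
	irq->cpu = new_cpu;
	model_compose_msg(irq, &irq->msg);
	return 0;
}
```

In this model the MSI source (for example an APLIC target register in MSI-mode) never changes its own destination; it is only updated through the affinity path, so the address and the target CPU stay consistent.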
Regards,
Anup