2024-02-20 06:07:55

by Anup Patel

Subject: [PATCH v13 00/13] Linux RISC-V AIA Support

The RISC-V AIA specification is ratified as per the RISC-V International
process. The latest ratified AIA specification can be found at:
https://github.com/riscv/riscv-aia/releases/download/1.0/riscv-interrupts-1.0.pdf

At a high-level, the AIA specification adds three things:
1) AIA CSRs
- Improved local interrupt support
2) Incoming Message Signaled Interrupt Controller (IMSIC)
- Per-HART MSI controller
- Support MSI virtualization
- Support IPI along with virtualization
3) Advanced Platform-Level Interrupt Controller (APLIC)
- Wired interrupt controller
- In MSI-mode, converts wired interrupt into MSIs (i.e. MSI generator)
- In Direct-mode, injects external interrupts directly into HARTs

For an overview of the AIA specification, refer to the AIA virtualization
talk at KVM Forum 2022:
https://static.sched.com/hosted_files/kvmforum2022/a1/AIA_Virtualization_in_KVM_RISCV_final.pdf
https://www.youtube.com/watch?v=r071dL8Z0yo

To test this series, use QEMU v7.2 (or higher) and OpenSBI v1.2 (or higher).

This series depends upon per-device MSI domain patches merged by Thomas (tglx)
which are available in irq/msi branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git

These patches can also be found in the riscv_aia_v13 branch at:
https://github.com/avpatel/linux.git

Changes since v12:
- Rebased on Linux-6.8-rc5
- Dropped per-device MSI domain patches which are already merged by Thomas (tglx)
- Addressed nit comments from Thomas and Clement
- Added a new patch2 to fix lock dependency warning
- Replaced local sync IPI in the IMSIC driver with per-CPU timer
- Simplified locking in the IMSIC driver to avoid lock dependency issues
- Added a dirty bitmap in the IMSIC driver to optimize per-CPU local sync loop

Changes since v11:
- Rebased on Linux-6.8-rc1
- Included kernel/irq related patches from "genirq, irqchip: Convert ARM
MSI handling to per device MSI domains" series by Thomas.
(PATCH7, PATCH8, PATCH9, PATCH14, PATCH16, PATCH17, PATCH18, PATCH19,
PATCH20, PATCH21, PATCH22, PATCH23, and PATCH32 of
https://lore.kernel.org/linux-arm-kernel/[email protected]/)
- Updated APLIC MSI-mode driver to use the new WIRED_TO_MSI mechanism.
- Updated IMSIC driver to support per-device MSI domains for PCI and
platform devices.

Changes since v10:
- Rebased on Linux-6.6-rc7
- Dropped PATCH3 of v10 series since this has been merged by MarcZ
for Linux-6.6-rc7
- Changed the IMSIC ID management strategy from 1-n approach to
x86-style 1-1 approach

Changes since v9:
- Rebased on Linux-6.6-rc4
- Use builtin_platform_driver() in PATCH5, PATCH9, and PATCH12

Changes since v8:
- Rebased on Linux-6.6-rc3
- Dropped PATCH2 of v8 series since we won't be requiring
riscv_get_intc_hartid() based on Marc Z's comments on ACPI AIA support.
- Addressed Saravana's comments in PATCH3 of v8 series
- Update PATCH9 and PATCH13 of v8 series based on comments from Sunil

Changes since v7:
- Rebased on Linux-6.6-rc1
- Addressed comments on PATCH1 of v7 series and split it into two PATCHes
- Use DEFINE_SIMPLE_PROP() in PATCH2 of v7 series

Changes since v6:
- Rebased on Linux-6.5-rc4
- Updated PATCH2 to use IS_ENABLED(CONFIG_SPARC) instead of
!IS_ENABLED(CONFIG_OF_IRQ)
- Added new PATCH4 to fix syscore registration in PLIC driver
- Update PATCH5 to convert PLIC driver into full-blown platform driver
with a re-written probe function.

Changes since v5:
- Rebased on Linux-6.5-rc2
- Updated the overall series to ensure that only IPI, timer, and
INTC drivers are probed very early whereas the rest of the interrupt
controllers (such as PLIC, APLIC, and IMSIC) are probed as
regular platform drivers.
- Renamed riscv_fw_parent_hartid() to riscv_get_intc_hartid()
- New PATCH1 to add fw_devlink support for msi-parent DT property
- New PATCH2 to ensure all INTC suppliers are initialized which in-turn
fixes the probing issue for PLIC, APLIC and IMSIC as platform driver
- New PATCH3 to use platform driver probing for PLIC
- Re-structured the IMSIC driver into two separate drivers: early and
platform. The IMSIC early driver (PATCH7) only initializes IMSIC state
and provides IPIs whereas the IMSIC platform driver (PATCH8) is probed
as a regular platform driver and provides MSI domain for platform devices.
- Re-structured the APLIC platform driver into three separate sources: main,
direct mode, and MSI mode.

Changes since v4:
- Rebased on Linux-6.5-rc1
- Added "Dependencies" in the APLIC bindings (PATCH6 in v4)
- Dropped the PATCH6 which was changing the IOMMU DMA domain APIs
- Dropped use of IOMMU DMA APIs in the IMSIC driver (PATCH4)

Changes since v3:
- Rebased on Linux-6.4-rc6
- Dropped PATCH2 of v3 series; instead, we now set FWNODE_FLAG_BEST_EFFORT via
IRQCHIP_DECLARE()
- Extend riscv_fw_parent_hartid() to support both DT and ACPI in PATCH1
- Extend iommu_dma_compose_msi_msg() instead of adding iommu_dma_select_msi()
in PATCH6
- Addressed Conor's comments in PATCH3
- Addressed Conor's and Rob's comments in PATCH7

Changes since v2:
- Rebased on Linux-6.4-rc1
- Addressed Rob's comments on DT bindings patches 4 and 8.
- Addressed Marc's comments on IMSIC driver PATCH5
- Replaced use of OF apis in APLIC and IMSIC drivers with FWNODE apis
this makes both drivers easily portable for ACPI support. This also
removes unnecessary indirection from the APLIC and IMSIC drivers.
- PATCH1 is a new patch for portability with ACPI support
- PATCH2 is a new patch to fix probing in APLIC drivers for APLIC-only systems.
- PATCH7 is a new patch which addresses the IOMMU DMA domain issues pointed
out by SiFive

Changes since v1:
- Rebased on Linux-6.2-rc2
- Addressed comments on IMSIC DT bindings for PATCH4
- Use raw_spin_lock_irqsave() on ids_lock for PATCH5
- Improved MMIO alignment checks in PATCH5 to allow MMIO regions
with holes.
- Addressed comments on APLIC DT bindings for PATCH6
- Fixed warning splat in aplic_msi_write_msg() caused by
zeroed MSI message in PATCH7
- Dropped DT property riscv,slow-ipi; a module parameter will be
added instead in the future.

Anup Patel (12):
irqchip/sifive-plic: Convert PLIC driver into a platform driver
irqchip/sifive-plic: Improve locking safety by using
irqsave/irqrestore
irqchip/riscv-intc: Add support for RISC-V AIA
dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
irqchip: Add RISC-V incoming MSI controller early driver
irqchip/riscv-imsic: Add device MSI domain support for platform
devices
irqchip/riscv-imsic: Add device MSI domain support for PCI devices
dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
irqchip: Add RISC-V advanced PLIC driver for direct-mode
irqchip/riscv-aplic: Add support for MSI-mode
RISC-V: Select APLIC and IMSIC drivers
MAINTAINERS: Add entry for RISC-V AIA drivers

Björn Töpel (1):
genirq/matrix: Dynamic bitmap allocation

.../interrupt-controller/riscv,aplic.yaml | 172 ++++
.../interrupt-controller/riscv,imsics.yaml | 172 ++++
MAINTAINERS | 14 +
arch/riscv/Kconfig | 2 +
arch/x86/include/asm/hw_irq.h | 2 -
drivers/irqchip/Kconfig | 25 +
drivers/irqchip/Makefile | 3 +
drivers/irqchip/irq-riscv-aplic-direct.c | 324 +++++++
drivers/irqchip/irq-riscv-aplic-main.c | 211 ++++
drivers/irqchip/irq-riscv-aplic-main.h | 52 +
drivers/irqchip/irq-riscv-aplic-msi.c | 263 +++++
drivers/irqchip/irq-riscv-imsic-early.c | 213 ++++
drivers/irqchip/irq-riscv-imsic-platform.c | 378 ++++++++
drivers/irqchip/irq-riscv-imsic-state.c | 906 ++++++++++++++++++
drivers/irqchip/irq-riscv-imsic-state.h | 99 ++
drivers/irqchip/irq-riscv-intc.c | 34 +-
drivers/irqchip/irq-sifive-plic.c | 285 ++++--
include/linux/irqchip/riscv-aplic.h | 145 +++
include/linux/irqchip/riscv-imsic.h | 87 ++
kernel/irq/matrix.c | 28 +-
20 files changed, 3298 insertions(+), 117 deletions(-)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
create mode 100644 drivers/irqchip/irq-riscv-aplic-direct.c
create mode 100644 drivers/irqchip/irq-riscv-aplic-main.c
create mode 100644 drivers/irqchip/irq-riscv-aplic-main.h
create mode 100644 drivers/irqchip/irq-riscv-aplic-msi.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-early.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-platform.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-state.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-state.h
create mode 100644 include/linux/irqchip/riscv-aplic.h
create mode 100644 include/linux/irqchip/riscv-imsic.h

--
2.34.1



2024-02-20 06:08:09

by Anup Patel

Subject: [PATCH v13 01/13] irqchip/sifive-plic: Convert PLIC driver into a platform driver

The PLIC driver does not require very early initialization so let
us convert it into a platform driver.

As part of the conversion, the PLIC probing undergoes the following
changes:
1. Use dev_info(), dev_err() and dev_warn() instead of pr_info(),
pr_err() and pr_warn()
2. Use devm_xyz() APIs wherever applicable
3. PLIC is now probed after CPUs are brought up, so we have to
set up cpuhp state after the context handlers of all online CPUs
are initialized; otherwise we see a crash on multi-socket systems

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/irq-sifive-plic.c | 269 ++++++++++++++++++++----------
1 file changed, 177 insertions(+), 92 deletions(-)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 5b7bc4fd9517..48483a1a41dd 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -3,7 +3,6 @@
* Copyright (C) 2017 SiFive
* Copyright (C) 2018 Christoph Hellwig
*/
-#define pr_fmt(fmt) "plic: " fmt
#include <linux/cpu.h>
#include <linux/interrupt.h>
#include <linux/io.h>
@@ -64,6 +63,7 @@
#define PLIC_QUIRK_EDGE_INTERRUPT 0

struct plic_priv {
+ struct device *dev;
struct cpumask lmask;
struct irq_domain *irqdomain;
void __iomem *regs;
@@ -371,7 +371,8 @@ static void plic_handle_irq(struct irq_desc *desc)
int err = generic_handle_domain_irq(handler->priv->irqdomain,
hwirq);
if (unlikely(err))
- pr_warn_ratelimited("can't find mapping for hwirq %lu\n",
+ dev_warn_ratelimited(handler->priv->dev,
+ "can't find mapping for hwirq %lu\n",
hwirq);
}

@@ -406,57 +407,126 @@ static int plic_starting_cpu(unsigned int cpu)
return 0;
}

-static int __init __plic_init(struct device_node *node,
- struct device_node *parent,
- unsigned long plic_quirks)
+static const struct of_device_id plic_match[] = {
+ { .compatible = "sifive,plic-1.0.0" },
+ { .compatible = "riscv,plic0" },
+ { .compatible = "andestech,nceplic100",
+ .data = (const void *)BIT(PLIC_QUIRK_EDGE_INTERRUPT) },
+ { .compatible = "thead,c900-plic",
+ .data = (const void *)BIT(PLIC_QUIRK_EDGE_INTERRUPT) },
+ {}
+};
+
+static int plic_parse_nr_irqs_and_contexts(struct platform_device *pdev,
+ u32 *nr_irqs, u32 *nr_contexts)
{
- int error = 0, nr_contexts, nr_handlers = 0, i;
- u32 nr_irqs;
- struct plic_priv *priv;
+ struct device *dev = &pdev->dev;
+ int rc;
+
+ /*
+ * Currently, only OF fwnode is supported so extend this
+ * function for ACPI support.
+ */
+ if (!is_of_node(dev->fwnode))
+ return -EINVAL;
+
+ rc = of_property_read_u32(to_of_node(dev->fwnode),
+ "riscv,ndev", nr_irqs);
+ if (rc) {
+ dev_err(dev, "riscv,ndev property not available\n");
+ return rc;
+ }
+
+ *nr_contexts = of_irq_count(to_of_node(dev->fwnode));
+ if (WARN_ON(!(*nr_contexts))) {
+ dev_err(dev, "no PLIC context available\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int plic_parse_context_parent_hwirq(struct platform_device *pdev,
+ u32 context, u32 *parent_hwirq,
+ unsigned long *parent_hartid)
+{
+ struct device *dev = &pdev->dev;
+ struct of_phandle_args parent;
+ int rc;
+
+ /*
+ * Currently, only OF fwnode is supported so extend this
+ * function for ACPI support.
+ */
+ if (!is_of_node(dev->fwnode))
+ return -EINVAL;
+
+ rc = of_irq_parse_one(to_of_node(dev->fwnode), context, &parent);
+ if (rc)
+ return rc;
+
+ rc = riscv_of_parent_hartid(parent.np, parent_hartid);
+ if (rc)
+ return rc;
+
+ *parent_hwirq = parent.args[0];
+ return 0;
+}
+
+static int plic_probe(struct platform_device *pdev)
+{
+ int rc, nr_contexts, nr_handlers = 0, i, cpu;
+ unsigned long plic_quirks = 0, hartid;
+ struct device *dev = &pdev->dev;
struct plic_handler *handler;
- unsigned int cpu;
+ u32 nr_irqs, parent_hwirq;
+ struct irq_domain *domain;
+ struct plic_priv *priv;
+ irq_hw_number_t hwirq;
+ struct resource *res;
+ bool cpuhp_setup;
+
+ if (is_of_node(dev->fwnode)) {
+ const struct of_device_id *id;
+
+ id = of_match_node(plic_match, to_of_node(dev->fwnode));
+ if (id)
+ plic_quirks = (unsigned long)id->data;
+ }

- priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
-
+ priv->dev = dev;
priv->plic_quirks = plic_quirks;

- priv->regs = of_iomap(node, 0);
- if (WARN_ON(!priv->regs)) {
- error = -EIO;
- goto out_free_priv;
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res) {
+ dev_err(dev, "failed to get MMIO resource\n");
+ return -EINVAL;
+ }
+ priv->regs = devm_ioremap(dev, res->start, resource_size(res));
+ if (!priv->regs) {
+ dev_err(dev, "failed map MMIO registers\n");
+ return -EIO;
}

- error = -EINVAL;
- of_property_read_u32(node, "riscv,ndev", &nr_irqs);
- if (WARN_ON(!nr_irqs))
- goto out_iounmap;
-
+ rc = plic_parse_nr_irqs_and_contexts(pdev, &nr_irqs, &nr_contexts);
+ if (rc) {
+ dev_err(dev, "failed to parse irqs and contexts\n");
+ return rc;
+ }
priv->nr_irqs = nr_irqs;

- priv->prio_save = bitmap_alloc(nr_irqs, GFP_KERNEL);
+ priv->prio_save = devm_bitmap_zalloc(dev, nr_irqs, GFP_KERNEL);
if (!priv->prio_save)
- goto out_free_priority_reg;
-
- nr_contexts = of_irq_count(node);
- if (WARN_ON(!nr_contexts))
- goto out_free_priority_reg;
-
- error = -ENOMEM;
- priv->irqdomain = irq_domain_add_linear(node, nr_irqs + 1,
- &plic_irqdomain_ops, priv);
- if (WARN_ON(!priv->irqdomain))
- goto out_free_priority_reg;
+ return -ENOMEM;

for (i = 0; i < nr_contexts; i++) {
- struct of_phandle_args parent;
- irq_hw_number_t hwirq;
- int cpu;
- unsigned long hartid;
-
- if (of_irq_parse_one(node, i, &parent)) {
- pr_err("failed to parse parent for context %d.\n", i);
+ rc = plic_parse_context_parent_hwirq(pdev, i,
+ &parent_hwirq, &hartid);
+ if (rc) {
+ dev_warn(dev, "hwirq for context%d not found\n", i);
continue;
}

@@ -464,7 +534,7 @@ static int __init __plic_init(struct device_node *node,
* Skip contexts other than external interrupts for our
* privilege level.
*/
- if (parent.args[0] != RV_IRQ_EXT) {
+ if (parent_hwirq != RV_IRQ_EXT) {
/* Disable S-mode enable bits if running in M-mode. */
if (IS_ENABLED(CONFIG_RISCV_M_MODE)) {
void __iomem *enable_base = priv->regs +
@@ -477,21 +547,17 @@ static int __init __plic_init(struct device_node *node,
continue;
}

- error = riscv_of_parent_hartid(parent.np, &hartid);
- if (error < 0) {
- pr_warn("failed to parse hart ID for context %d.\n", i);
- continue;
- }
-
cpu = riscv_hartid_to_cpuid(hartid);
if (cpu < 0) {
- pr_warn("Invalid cpuid for context %d\n", i);
+ dev_warn(dev, "Invalid cpuid for context %d\n", i);
continue;
}

/* Find parent domain and register chained handler */
- if (!plic_parent_irq && irq_find_host(parent.np)) {
- plic_parent_irq = irq_of_parse_and_map(node, i);
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+ DOMAIN_BUS_ANY);
+ if (!plic_parent_irq && domain) {
+ plic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
if (plic_parent_irq)
irq_set_chained_handler(plic_parent_irq,
plic_handle_irq);
@@ -504,7 +570,7 @@ static int __init __plic_init(struct device_node *node,
*/
handler = per_cpu_ptr(&plic_handlers, cpu);
if (handler->present) {
- pr_warn("handler already present for context %d.\n", i);
+ dev_warn(dev, "handler already present for context%d.\n", i);
plic_set_threshold(handler, PLIC_DISABLE_THRESHOLD);
goto done;
}
@@ -518,10 +584,15 @@ static int __init __plic_init(struct device_node *node,
i * CONTEXT_ENABLE_SIZE;
handler->priv = priv;

- handler->enable_save = kcalloc(DIV_ROUND_UP(nr_irqs, 32),
- sizeof(*handler->enable_save), GFP_KERNEL);
- if (!handler->enable_save)
- goto out_free_enable_reg;
+ handler->enable_save = devm_kcalloc(dev,
+ DIV_ROUND_UP(nr_irqs, 32),
+ sizeof(*handler->enable_save),
+ GFP_KERNEL);
+ if (!handler->enable_save) {
+ rc = -ENOMEM;
+ goto fail_cleanup_contexts;
+ }
+
done:
for (hwirq = 1; hwirq <= nr_irqs; hwirq++) {
plic_toggle(handler, hwirq, 0);
@@ -531,52 +602,66 @@ static int __init __plic_init(struct device_node *node,
nr_handlers++;
}

+ priv->irqdomain = irq_domain_create_linear(dev->fwnode, nr_irqs + 1,
+ &plic_irqdomain_ops, priv);
+ if (WARN_ON(!priv->irqdomain)) {
+ rc = -ENOMEM;
+ goto fail_cleanup_contexts;
+ }
+
/*
* We can have multiple PLIC instances so setup cpuhp state
- * and register syscore operations only when context handler
- * for current/boot CPU is present.
+ * and register syscore operations only once after context
+ * handlers of all online CPUs are initialized.
*/
- handler = this_cpu_ptr(&plic_handlers);
- if (handler->present && !plic_cpuhp_setup_done) {
- cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
- "irqchip/sifive/plic:starting",
- plic_starting_cpu, plic_dying_cpu);
- register_syscore_ops(&plic_irq_syscore_ops);
- plic_cpuhp_setup_done = true;
+ if (!plic_cpuhp_setup_done) {
+ cpuhp_setup = true;
+ for_each_online_cpu(cpu) {
+ handler = per_cpu_ptr(&plic_handlers, cpu);
+ if (!handler->present) {
+ cpuhp_setup = false;
+ break;
+ }
+ }
+ if (cpuhp_setup) {
+ cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
+ "irqchip/sifive/plic:starting",
+ plic_starting_cpu, plic_dying_cpu);
+ register_syscore_ops(&plic_irq_syscore_ops);
+ plic_cpuhp_setup_done = true;
+ }
}

- pr_info("%pOFP: mapped %d interrupts with %d handlers for"
- " %d contexts.\n", node, nr_irqs, nr_handlers, nr_contexts);
+ dev_info(dev, "mapped %d interrupts with %d handlers for %d contexts.\n",
+ nr_irqs, nr_handlers, nr_contexts);
return 0;

-out_free_enable_reg:
- for_each_cpu(cpu, cpu_present_mask) {
+fail_cleanup_contexts:
+ for (i = 0; i < nr_contexts; i++) {
+ if (plic_parse_context_parent_hwirq(pdev, i,
+ &parent_hwirq, &hartid))
+ continue;
+ if (parent_hwirq != RV_IRQ_EXT)
+ continue;
+ cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0)
+ continue;
+
handler = per_cpu_ptr(&plic_handlers, cpu);
- kfree(handler->enable_save);
+ handler->present = false;
+ handler->hart_base = NULL;
+ handler->enable_base = NULL;
+ handler->enable_save = NULL;
+ handler->priv = NULL;
}
-out_free_priority_reg:
- kfree(priv->prio_save);
-out_iounmap:
- iounmap(priv->regs);
-out_free_priv:
- kfree(priv);
- return error;
+ return rc;
}

-static int __init plic_init(struct device_node *node,
- struct device_node *parent)
-{
- return __plic_init(node, parent, 0);
-}
-
-IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init);
-IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
-
-static int __init plic_edge_init(struct device_node *node,
- struct device_node *parent)
-{
- return __plic_init(node, parent, BIT(PLIC_QUIRK_EDGE_INTERRUPT));
-}
-
-IRQCHIP_DECLARE(andestech_nceplic100, "andestech,nceplic100", plic_edge_init);
-IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", plic_edge_init);
+static struct platform_driver plic_driver = {
+ .driver = {
+ .name = "riscv-plic",
+ .of_match_table = plic_match,
+ },
+ .probe = plic_probe,
+};
+builtin_platform_driver(plic_driver);
--
2.34.1
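
The cpuhp gating rule from item 3 of the commit message above can be
illustrated with a minimal userspace C sketch. This is not driver code:
all_online_cpus_have_handler() and its two arrays are hypothetical
stand-ins for the for_each_online_cpu() loop over the per-CPU
plic_handlers state, assumed here only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the check done before cpuhp_setup_state():
 * return true only when every online CPU already has a PLIC context
 * handler marked present. Offline CPUs are ignored.
 */
static bool all_online_cpus_have_handler(const bool *cpu_online,
					 const bool *handler_present,
					 size_t nr_cpus)
{
	assert(cpu_online && handler_present);

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu_online[cpu] && !handler_present[cpu])
			return false;
	}
	return true;
}
```

On a two-socket system with one PLIC instance per socket, the first
probe would leave the handlers of the other socket's CPUs absent, so
this check fails and registration is deferred; the probe of the second
instance then finds all handlers present and performs the one-time setup.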


2024-02-20 06:08:23

by Anup Patel

Subject: [PATCH v13 02/13] irqchip/sifive-plic: Improve locking safety by using irqsave/irqrestore

Now that PLIC driver is probed as a regular platform driver, the lock
dependency validator complains about the safety of handler->enable_lock
usage:

[ 0.956775] Possible interrupt unsafe locking scenario:

[ 0.956998] CPU0 CPU1
[ 0.957247] ---- ----
[ 0.957439] lock(&handler->enable_lock);
[ 0.957607] local_irq_disable();
[ 0.957793] lock(&irq_desc_lock_class);
[ 0.958021] lock(&handler->enable_lock);
[ 0.958246] <Interrupt>
[ 0.958342] lock(&irq_desc_lock_class);
[ 0.958501]
*** DEADLOCK ***

To address the above, let's use raw_spin_lock_irqsave/unlock_irqrestore()
instead of raw_spin_lock/unlock().

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/irq-sifive-plic.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 48483a1a41dd..e91077ff171f 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -103,9 +103,11 @@ static void __plic_toggle(void __iomem *enable_base, int hwirq, int enable)

static void plic_toggle(struct plic_handler *handler, int hwirq, int enable)
{
- raw_spin_lock(&handler->enable_lock);
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&handler->enable_lock, flags);
__plic_toggle(handler->enable_base, hwirq, enable);
- raw_spin_unlock(&handler->enable_lock);
+ raw_spin_unlock_irqrestore(&handler->enable_lock, flags);
}

static inline void plic_irq_toggle(const struct cpumask *mask,
@@ -236,6 +238,7 @@ static int plic_irq_set_type(struct irq_data *d, unsigned int type)
static int plic_irq_suspend(void)
{
unsigned int i, cpu;
+ unsigned long flags;
u32 __iomem *reg;
struct plic_priv *priv;

@@ -253,12 +256,12 @@ static int plic_irq_suspend(void)
if (!handler->present)
continue;

- raw_spin_lock(&handler->enable_lock);
+ raw_spin_lock_irqsave(&handler->enable_lock, flags);
for (i = 0; i < DIV_ROUND_UP(priv->nr_irqs, 32); i++) {
reg = handler->enable_base + i * sizeof(u32);
handler->enable_save[i] = readl(reg);
}
- raw_spin_unlock(&handler->enable_lock);
+ raw_spin_unlock_irqrestore(&handler->enable_lock, flags);
}

return 0;
@@ -267,6 +270,7 @@ static int plic_irq_suspend(void)
static void plic_irq_resume(void)
{
unsigned int i, index, cpu;
+ unsigned long flags;
u32 __iomem *reg;
struct plic_priv *priv;

@@ -284,12 +288,12 @@ static void plic_irq_resume(void)
if (!handler->present)
continue;

- raw_spin_lock(&handler->enable_lock);
+ raw_spin_lock_irqsave(&handler->enable_lock, flags);
for (i = 0; i < DIV_ROUND_UP(priv->nr_irqs, 32); i++) {
reg = handler->enable_base + i * sizeof(u32);
writel(handler->enable_save[i], reg);
}
- raw_spin_unlock(&handler->enable_lock);
+ raw_spin_unlock_irqrestore(&handler->enable_lock, flags);
}
}

--
2.34.1


2024-02-20 06:08:37

by Anup Patel

Subject: [PATCH v13 03/13] irqchip/riscv-intc: Add support for RISC-V AIA

The RISC-V advanced interrupt architecture (AIA) extends the per-HART
local interrupts in the following ways:
1. Minimum 64 local interrupts for both RV32 and RV64
2. Ability to process multiple pending local interrupts in the same
interrupt handler
3. Priority configuration for each local interrupt
4. Special CSRs to configure/access the per-HART MSI controller

We add support for #1 and #2 described above in the RISC-V intc driver.

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/irq-riscv-intc.c | 34 ++++++++++++++++++++++++++------
1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index e8d01b14ccdd..bab536bbaf2c 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -17,6 +17,7 @@
#include <linux/module.h>
#include <linux/of.h>
#include <linux/smp.h>
+#include <asm/hwcap.h>

static struct irq_domain *intc_domain;

@@ -30,6 +31,15 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
generic_handle_domain_irq(intc_domain, cause);
}

+static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)
+{
+ unsigned long topi;
+
+ while ((topi = csr_read(CSR_TOPI)))
+ generic_handle_domain_irq(intc_domain,
+ topi >> TOPI_IID_SHIFT);
+}
+
/*
* On RISC-V systems local interrupts are masked or unmasked by writing
* the SIE (Supervisor Interrupt Enable) CSR. As CSRs can only be written
@@ -39,12 +49,18 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)

static void riscv_intc_irq_mask(struct irq_data *d)
{
- csr_clear(CSR_IE, BIT(d->hwirq));
+ if (IS_ENABLED(CONFIG_32BIT) && d->hwirq >= BITS_PER_LONG)
+ csr_clear(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
+ else
+ csr_clear(CSR_IE, BIT(d->hwirq));
}

static void riscv_intc_irq_unmask(struct irq_data *d)
{
- csr_set(CSR_IE, BIT(d->hwirq));
+ if (IS_ENABLED(CONFIG_32BIT) && d->hwirq >= BITS_PER_LONG)
+ csr_set(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
+ else
+ csr_set(CSR_IE, BIT(d->hwirq));
}

static void riscv_intc_irq_eoi(struct irq_data *d)
@@ -115,16 +131,20 @@ static struct fwnode_handle *riscv_intc_hwnode(void)

static int __init riscv_intc_init_common(struct fwnode_handle *fn)
{
- int rc;
+ int rc, nr_irqs = riscv_isa_extension_available(NULL, SxAIA) ?
+ 64 : BITS_PER_LONG;

- intc_domain = irq_domain_create_linear(fn, BITS_PER_LONG,
+ intc_domain = irq_domain_create_linear(fn, nr_irqs,
&riscv_intc_domain_ops, NULL);
if (!intc_domain) {
pr_err("unable to add IRQ domain\n");
return -ENXIO;
}

- rc = set_handle_irq(&riscv_intc_irq);
+ if (riscv_isa_extension_available(NULL, SxAIA))
+ rc = set_handle_irq(&riscv_intc_aia_irq);
+ else
+ rc = set_handle_irq(&riscv_intc_irq);
if (rc) {
pr_err("failed to set irq handler\n");
return rc;
@@ -132,7 +152,9 @@ static int __init riscv_intc_init_common(struct fwnode_handle *fn)

riscv_set_intc_hwnode_fn(riscv_intc_hwnode);

- pr_info("%d local interrupts mapped\n", BITS_PER_LONG);
+ pr_info("%d local interrupts mapped%s\n",
+ nr_irqs, riscv_isa_extension_available(NULL, SxAIA) ?
+ " using AIA" : "");

return 0;
}
--
2.34.1
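
The two mechanisms this patch relies on, selecting between the IE and
IEH CSRs on RV32 and extracting the interrupt identity from the TOPI
value, can be sketched in plain userspace C. Everything prefixed with
model_ or MODEL_ is a hypothetical name introduced for illustration;
the shift value of 16 is an assumption taken from the TOPI layout in
the AIA specification (identity in bits 27:16, priority in bits 7:0),
and no real CSR is accessed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed TOPI layout: interrupt identity starts at bit 16. */
#define MODEL_TOPI_IID_SHIFT 16

enum model_csr { MODEL_CSR_IE, MODEL_CSR_IEH };

struct model_ie_bit {
	enum model_csr csr;	/* which enable CSR holds the bit */
	unsigned int bit;	/* bit position inside that CSR */
};

/*
 * Map a local interrupt number to its enable bit. On a 32-bit HART,
 * interrupts 32..63 live in the high-half CSR; on a 64-bit HART all
 * 64 interrupts fit in the single IE CSR.
 */
static struct model_ie_bit model_ie_bit_for(unsigned int hwirq,
					    unsigned int xlen)
{
	struct model_ie_bit r;

	assert(xlen == 32 || xlen == 64);
	assert(hwirq < 64);

	if (xlen == 32 && hwirq >= xlen) {
		r.csr = MODEL_CSR_IEH;
		r.bit = hwirq - xlen;
	} else {
		r.csr = MODEL_CSR_IE;
		r.bit = hwirq;
	}
	return r;
}

/* Extract the interrupt identity, as the AIA handler loop does. */
static unsigned int model_topi_to_hwirq(uint64_t topi)
{
	return (unsigned int)(topi >> MODEL_TOPI_IID_SHIFT);
}
```

In the real handler, the TOPI CSR is re-read after each dispatched
interrupt and the loop ends once it reads as zero, which is how
several pending local interrupts get handled in a single trap.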


2024-02-20 06:08:50

by Anup Patel

Subject: [PATCH v13 04/13] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller

We add DT bindings document for the RISC-V incoming MSI controller
(IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
specification.

Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Acked-by: Krzysztof Kozlowski <[email protected]>
---
.../interrupt-controller/riscv,imsics.yaml | 172 ++++++++++++++++++
1 file changed, 172 insertions(+)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml

diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
new file mode 100644
index 000000000000..84976f17a4a1
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
@@ -0,0 +1,172 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Incoming MSI Controller (IMSIC)
+
+maintainers:
+ - Anup Patel <[email protected]>
+
+description: |
+ The RISC-V advanced interrupt architecture (AIA) defines a per-CPU incoming
+ MSI controller (IMSIC) for handling MSIs in a RISC-V platform. The RISC-V
+ AIA specification can be found at https://github.com/riscv/riscv-aia.
+
+ The IMSIC is a per-CPU (or per-HART) device with separate interrupt file
+ for each privilege level (machine or supervisor). The configuration of
+ a IMSIC interrupt file is done using AIA CSRs and it also has a 4KB MMIO
+ space to receive MSIs from devices. Each IMSIC interrupt file supports a
+ fixed number of interrupt identities (to distinguish MSIs from devices)
+ which is same for given privilege level across CPUs (or HARTs).
+
+ The device tree of a RISC-V platform will have one IMSIC device tree node
+ for each privilege level (machine or supervisor) which collectively describe
+ IMSIC interrupt files at that privilege level across CPUs (or HARTs).
+
+ The arrangement of IMSIC interrupt files in MMIO space of a RISC-V platform
+ follows a particular scheme defined by the RISC-V AIA specification. A IMSIC
+ group is a set of IMSIC interrupt files co-located in MMIO space and we can
+ have multiple IMSIC groups (i.e. clusters, sockets, chiplets, etc) in a
+ RISC-V platform. The MSI target address of a IMSIC interrupt file at given
+ privilege level (machine or supervisor) encodes group index, HART index,
+ and guest index (shown below).
+
+ XLEN-1 > (HART Index MSB) 12 0
+ | | | |
+ -------------------------------------------------------------
+ |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index| 0 |
+ -------------------------------------------------------------
+
+allOf:
+ - $ref: /schemas/interrupt-controller.yaml#
+ - $ref: /schemas/interrupt-controller/msi-controller.yaml#
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - qemu,imsics
+ - const: riscv,imsics
+
+ reg:
+ minItems: 1
+ maxItems: 16384
+ description:
+ Base address of each IMSIC group.
+
+ interrupt-controller: true
+
+ "#interrupt-cells":
+ const: 0
+
+ msi-controller: true
+
+ "#msi-cells":
+ const: 0
+
+ interrupts-extended:
+ minItems: 1
+ maxItems: 16384
+ description:
+ This property represents the set of CPUs (or HARTs) for which given
+ device tree node describes the IMSIC interrupt files. Each node pointed
+ to should be a riscv,cpu-intc node, which has a CPU node (i.e. RISC-V
+ HART) as parent.
+
+ riscv,num-ids:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 63
+ maximum: 2047
+ description:
+ Number of interrupt identities supported by IMSIC interrupt file.
+
+ riscv,num-guest-ids:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 63
+ maximum: 2047
+ description:
+ Number of interrupt identities are supported by IMSIC guest interrupt
+ file. When not specified it is assumed to be same as specified by the
+ riscv,num-ids property.
+
+ riscv,guest-index-bits:
+ minimum: 0
+ maximum: 7
+ default: 0
+ description:
+ Number of guest index bits in the MSI target address.
+
+ riscv,hart-index-bits:
+ minimum: 0
+ maximum: 15
+ description:
+ Number of HART index bits in the MSI target address. When not
+ specified it is calculated based on the interrupts-extended property.
+
+ riscv,group-index-bits:
+ minimum: 0
+ maximum: 7
+ default: 0
+ description:
+ Number of group index bits in the MSI target address.
+
+ riscv,group-index-shift:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 0
+ maximum: 55
+ default: 24
+ description:
+ The least significant bit position of the group index bits in the
+ MSI target address.
+
+required:
+ - compatible
+ - reg
+ - interrupt-controller
+ - msi-controller
+ - "#msi-cells"
+ - interrupts-extended
+ - riscv,num-ids
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ // Example 1 (Machine-level IMSIC files with just one group):
+
+ interrupt-controller@24000000 {
+ compatible = "qemu,imsics", "riscv,imsics";
+ interrupts-extended = <&cpu1_intc 11>,
+ <&cpu2_intc 11>,
+ <&cpu3_intc 11>,
+ <&cpu4_intc 11>;
+ reg = <0x28000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <0>;
+ msi-controller;
+ #msi-cells = <0>;
+ riscv,num-ids = <127>;
+ };
+
+ - |
+ // Example 2 (Supervisor-level IMSIC files with two groups):
+
+ interrupt-controller@28000000 {
+ compatible = "qemu,imsics", "riscv,imsics";
+ interrupts-extended = <&cpu1_intc 9>,
+ <&cpu2_intc 9>,
+ <&cpu3_intc 9>,
+ <&cpu4_intc 9>;
+ reg = <0x28000000 0x2000>, /* Group0 IMSICs */
+ <0x29000000 0x2000>; /* Group1 IMSICs */
+ interrupt-controller;
+ #interrupt-cells = <0>;
+ msi-controller;
+ #msi-cells = <0>;
+ riscv,num-ids = <127>;
+ riscv,group-index-bits = <1>;
+ riscv,group-index-shift = <24>;
+ };
+...
--
2.34.1
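
The MSI target address layout drawn in the binding description above
can be made concrete with a small userspace C sketch. It assumes the
field order shown in the diagram (group index, HART index, guest index,
then a 12-bit page offset) and a base address that already points at
the interrupt file of group 0, HART 0, guest 0. The struct and function
names are hypothetical and are not taken from the IMSIC driver; no
validation of field widths against each other is attempted.

```c
#include <assert.h>
#include <stdint.h>

/* Each interrupt file occupies one 4KB MMIO page. */
#define MODEL_IMSIC_PAGE_SHIFT 12

struct model_imsic_layout {
	uint64_t base;			/* group 0, HART 0, guest 0 file */
	unsigned int guest_index_bits;	/* riscv,guest-index-bits */
	unsigned int group_index_shift;	/* riscv,group-index-shift */
};

/*
 * Compose the MSI target address of one interrupt file.
 * 'hart' is the HART index within its group, not a global HART ID.
 */
static uint64_t model_imsic_msi_addr(const struct model_imsic_layout *l,
				     unsigned int group,
				     unsigned int hart,
				     unsigned int guest)
{
	assert(guest < (1u << l->guest_index_bits));

	return l->base |
	       ((uint64_t)group << l->group_index_shift) |
	       ((uint64_t)hart <<
		(l->guest_index_bits + MODEL_IMSIC_PAGE_SHIFT)) |
	       ((uint64_t)guest << MODEL_IMSIC_PAGE_SHIFT);
}
```

With the values from Example 2 (base 0x28000000, group-index-shift 24,
no guest index bits), the second HART of the second group maps to
0x29001000, which falls inside the Group1 reg entry of that example.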


2024-02-20 06:09:06

by Anup Patel

Subject: [PATCH v13 05/13] genirq/matrix: Dynamic bitmap allocation

From: Björn Töpel <[email protected]>

Some (future) users of the irq matrix allocator do not know the size
of the matrix bitmaps at compile time.

To avoid wasting memory on unnecessarily large bitmaps, size the bitmap
at matrix allocation time.

Signed-off-by: Björn Töpel <[email protected]>
Signed-off-by: Anup Patel <[email protected]>
---
arch/x86/include/asm/hw_irq.h | 2 --
kernel/irq/matrix.c | 28 +++++++++++++++++-----------
2 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index b02c3cd3c0f6..edebf1020e04 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -16,8 +16,6 @@

#include <asm/irq_vectors.h>

-#define IRQ_MATRIX_BITS NR_VECTORS
-
#ifndef __ASSEMBLY__

#include <linux/percpu.h>
diff --git a/kernel/irq/matrix.c b/kernel/irq/matrix.c
index 75d0ae490e29..8f222d1cccec 100644
--- a/kernel/irq/matrix.c
+++ b/kernel/irq/matrix.c
@@ -8,8 +8,6 @@
#include <linux/cpu.h>
#include <linux/irq.h>

-#define IRQ_MATRIX_SIZE (BITS_TO_LONGS(IRQ_MATRIX_BITS))
-
struct cpumap {
unsigned int available;
unsigned int allocated;
@@ -17,8 +15,8 @@ struct cpumap {
unsigned int managed_allocated;
bool initialized;
bool online;
- unsigned long alloc_map[IRQ_MATRIX_SIZE];
- unsigned long managed_map[IRQ_MATRIX_SIZE];
+ unsigned long *managed_map;
+ unsigned long alloc_map[];
};

struct irq_matrix {
@@ -32,8 +30,8 @@ struct irq_matrix {
unsigned int total_allocated;
unsigned int online_maps;
struct cpumap __percpu *maps;
- unsigned long scratch_map[IRQ_MATRIX_SIZE];
- unsigned long system_map[IRQ_MATRIX_SIZE];
+ unsigned long *system_map;
+ unsigned long scratch_map[];
};

#define CREATE_TRACE_POINTS
@@ -50,24 +48,32 @@ __init struct irq_matrix *irq_alloc_matrix(unsigned int matrix_bits,
unsigned int alloc_start,
unsigned int alloc_end)
{
+ unsigned int cpu, matrix_size = BITS_TO_LONGS(matrix_bits);
struct irq_matrix *m;

- if (matrix_bits > IRQ_MATRIX_BITS)
- return NULL;
-
- m = kzalloc(sizeof(*m), GFP_KERNEL);
+ m = kzalloc(struct_size(m, scratch_map, matrix_size * 2), GFP_KERNEL);
if (!m)
return NULL;

+ m->system_map = &m->scratch_map[matrix_size];
+
m->matrix_bits = matrix_bits;
m->alloc_start = alloc_start;
m->alloc_end = alloc_end;
m->alloc_size = alloc_end - alloc_start;
- m->maps = alloc_percpu(*m->maps);
+ m->maps = __alloc_percpu(struct_size(m->maps, alloc_map, matrix_size * 2),
+ __alignof__(*m->maps));
if (!m->maps) {
kfree(m);
return NULL;
}
+
+ for_each_possible_cpu(cpu) {
+ struct cpumap *cm = per_cpu_ptr(m->maps, cpu);
+
+ cm->managed_map = &cm->alloc_map[matrix_size];
+ }
+
return m;
}

--
2.34.1


2024-02-20 06:09:21

by Anup Patel

Subject: [PATCH v13 06/13] irqchip: Add RISC-V incoming MSI controller early driver

The RISC-V advanced interrupt architecture (AIA) specification
defines a new MSI controller called incoming message signalled
interrupt controller (IMSIC) which manages MSIs on a per-HART (or
per-CPU) basis. It also supports IPIs as software-injected MSIs.
(For more details, refer to https://github.com/riscv/riscv-aia)

Let us add an early irqchip driver for RISC-V IMSIC which sets
up the IMSIC state and provides IPIs.
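
As a rough mental model only, a per-HART interrupt file can be pictured as a pair of pending and enabled bitmaps. The sketch below is a userspace simulation, not driver code. It assumes 63 identities, identity 1 reserved for IPIs, and lower identity numbers taking priority; struct intr_file, msi_write() and claim_top() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define NR_IDS 63u /* hypothetical: identities 1..63, ID 0 is never valid */
#define IPI_ID 1u  /* assumption: identity reserved for IPIs */

struct intr_file {
	uint64_t pending; /* bit n set: identity n is pending */
	uint64_t enabled; /* bit n set: identity n is enabled */
};

/* Model of an MSI write (or a software-injected IPI): mark an ID pending. */
static void msi_write(struct intr_file *f, unsigned id)
{
	assert(id >= 1 && id <= NR_IDS);
	f->pending |= UINT64_C(1) << id;
}

/* Claim the lowest-numbered pending-and-enabled ID; 0 means none is left. */
static unsigned claim_top(struct intr_file *f)
{
	uint64_t ready = f->pending & f->enabled;
	unsigned id;

	for (id = 1; id <= NR_IDS; id++) {
		if (ready & (UINT64_C(1) << id)) {
			/* Claiming an identity clears its pending bit. */
			f->pending &= ~(UINT64_C(1) << id);
			return id;
		}
	}
	return 0;
}
```

In this model an IPI is simply msi_write() with IPI_ID on the target file, and a handler keeps calling claim_top() until it returns 0. Identities that are pending but not enabled stay pending.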

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Kconfig | 7 +
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-riscv-imsic-early.c | 213 ++++++
drivers/irqchip/irq-riscv-imsic-state.c | 906 ++++++++++++++++++++++++
drivers/irqchip/irq-riscv-imsic-state.h | 98 +++
include/linux/irqchip/riscv-imsic.h | 87 +++
6 files changed, 1312 insertions(+)
create mode 100644 drivers/irqchip/irq-riscv-imsic-early.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-state.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-state.h
create mode 100644 include/linux/irqchip/riscv-imsic.h

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index f7149d0f3d45..85f86e31c996 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -546,6 +546,13 @@ config SIFIVE_PLIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP

+config RISCV_IMSIC
+ bool
+ depends on RISCV
+ select IRQ_DOMAIN_HIERARCHY
+ select GENERIC_IRQ_MATRIX_ALLOCATOR
+ select GENERIC_MSI_IRQ
+
config EXYNOS_IRQ_COMBINER
bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index ffd945fe71aa..d714724387ce 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
+obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
new file mode 100644
index 000000000000..32fe428b1c19
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic-early.c
@@ -0,0 +1,213 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-imsic: " fmt
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/module.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+
+#include "irq-riscv-imsic-state.h"
+
+static int imsic_parent_irq;
+
+#ifdef CONFIG_SMP
+static void imsic_ipi_send(unsigned int cpu)
+{
+ struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu);
+
+ writel_relaxed(IMSIC_IPI_ID, local->msi_va);
+}
+
+static void imsic_ipi_starting_cpu(void)
+{
+ /* Enable IPIs for current CPU. */
+ __imsic_id_set_enable(IMSIC_IPI_ID);
+}
+
+static void imsic_ipi_dying_cpu(void)
+{
+ /* Disable IPIs for current CPU. */
+ __imsic_id_clear_enable(IMSIC_IPI_ID);
+}
+
+static int __init imsic_ipi_domain_init(void)
+{
+ int virq;
+
+ /* Create IMSIC IPI multiplexing */
+ virq = ipi_mux_create(IMSIC_NR_IPI, imsic_ipi_send);
+ if (virq <= 0)
+ return (virq < 0) ? virq : -ENOMEM;
+
+ /* Set vIRQ range */
+ riscv_ipi_set_virq_range(virq, IMSIC_NR_IPI, true);
+
+ /* Announce that IMSIC is providing IPIs */
+ pr_info("%pfwP: providing IPIs using interrupt %d\n", imsic->fwnode, IMSIC_IPI_ID);
+
+ return 0;
+}
+#else
+static void imsic_ipi_starting_cpu(void)
+{
+}
+
+static void imsic_ipi_dying_cpu(void)
+{
+}
+
+static int __init imsic_ipi_domain_init(void)
+{
+ return 0;
+}
+#endif
+
+/*
+ * To handle an interrupt, we read the TOPEI CSR and write zero in one
+ * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
+ * Linux interrupt number and let Linux IRQ subsystem handle it.
+ */
+static void imsic_handle_irq(struct irq_desc *desc)
+{
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ int err, cpu = smp_processor_id();
+ struct imsic_vector *vec;
+ unsigned long local_id;
+
+ chained_irq_enter(chip, desc);
+
+ while ((local_id = csr_swap(CSR_TOPEI, 0))) {
+ local_id = local_id >> TOPEI_ID_SHIFT;
+
+ if (local_id == IMSIC_IPI_ID) {
+#ifdef CONFIG_SMP
+ ipi_mux_process();
+#endif
+ continue;
+ }
+
+ if (unlikely(!imsic->base_domain))
+ continue;
+
+ vec = imsic_vector_from_local_id(cpu, local_id);
+ if (!vec) {
+ pr_warn_ratelimited("vector not found for local ID 0x%lx\n", local_id);
+ continue;
+ }
+
+ err = generic_handle_domain_irq(imsic->base_domain,
+ vec->hwirq);
+ if (unlikely(err))
+ pr_warn_ratelimited("hwirq 0x%x mapping not found\n", vec->hwirq);
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static int imsic_starting_cpu(unsigned int cpu)
+{
+ /* Mark per-CPU IMSIC state as online */
+ imsic_state_online();
+
+ /* Enable per-CPU parent interrupt */
+ enable_percpu_irq(imsic_parent_irq, irq_get_trigger_type(imsic_parent_irq));
+
+ /* Setup IPIs */
+ imsic_ipi_starting_cpu();
+
+ /*
+ * Interrupt identities might have been enabled/disabled while
+ * this CPU was not running so sync-up local enable/disable state.
+ */
+ imsic_local_sync_all();
+
+ /* Enable local interrupt delivery */
+ imsic_local_delivery(true);
+
+ return 0;
+}
+
+static int imsic_dying_cpu(unsigned int cpu)
+{
+ /* Cleanup IPIs */
+ imsic_ipi_dying_cpu();
+
+ /* Mark per-CPU IMSIC state as offline */
+ imsic_state_offline();
+
+ return 0;
+}
+
+static int __init imsic_early_probe(struct fwnode_handle *fwnode)
+{
+ struct irq_domain *domain;
+ int rc;
+
+ /* Find parent domain and register chained handler */
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(), DOMAIN_BUS_ANY);
+ if (!domain) {
+ pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
+ return -ENOENT;
+ }
+ imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (!imsic_parent_irq) {
+ pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
+ return -ENOENT;
+ }
+
+ /* Initialize IPI domain */
+ rc = imsic_ipi_domain_init();
+ if (rc) {
+ pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
+ return rc;
+ }
+
+ /* Setup chained handler to the parent domain interrupt */
+ irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
+
+ /*
+ * Setup cpuhp state (must be done after setting imsic_parent_irq)
+ *
+ * Don't disable per-CPU IMSIC file when CPU goes offline
+ * because this affects IPI and the masking/unmasking of
+ * virtual IPIs is done via generic IPI-Mux
+ */
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "irqchip/riscv/imsic:starting",
+ imsic_starting_cpu, imsic_dying_cpu);
+
+ return 0;
+}
+
+static int __init imsic_early_dt_init(struct device_node *node,
+ struct device_node *parent)
+{
+ struct fwnode_handle *fwnode = &node->fwnode;
+ int rc;
+
+ /* Setup IMSIC state */
+ rc = imsic_setup_state(fwnode);
+ if (rc) {
+ pr_err("%pfwP: failed to setup state (error %d)\n",
+ fwnode, rc);
+ return rc;
+ }
+
+ /* Do early setup of IPIs */
+ rc = imsic_early_probe(fwnode);
+ if (rc)
+ return rc;
+
+ /* Ensure that OF platform device gets probed */
+ of_node_clear_flag(node, OF_POPULATED);
+ return 0;
+}
+IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_early_dt_init);
diff --git a/drivers/irqchip/irq-riscv-imsic-state.c b/drivers/irqchip/irq-riscv-imsic-state.c
new file mode 100644
index 000000000000..4f347486ec7c
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic-state.c
@@ -0,0 +1,906 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-imsic: " fmt
+#include <linux/cpu.h>
+#include <linux/bitmap.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/seq_file.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+#include <asm/hwcap.h>
+
+#include "irq-riscv-imsic-state.h"
+
+#define IMSIC_DISABLE_EIDELIVERY 0
+#define IMSIC_ENABLE_EIDELIVERY 1
+#define IMSIC_DISABLE_EITHRESHOLD 1
+#define IMSIC_ENABLE_EITHRESHOLD 0
+
+static inline void imsic_csr_write(unsigned long reg, unsigned long val)
+{
+ csr_write(CSR_ISELECT, reg);
+ csr_write(CSR_IREG, val);
+}
+
+static inline unsigned long imsic_csr_read(unsigned long reg)
+{
+ csr_write(CSR_ISELECT, reg);
+ return csr_read(CSR_IREG);
+}
+
+static inline unsigned long imsic_csr_read_clear(unsigned long reg, unsigned long val)
+{
+ csr_write(CSR_ISELECT, reg);
+ return csr_read_clear(CSR_IREG, val);
+}
+
+static inline void imsic_csr_set(unsigned long reg, unsigned long val)
+{
+ csr_write(CSR_ISELECT, reg);
+ csr_set(CSR_IREG, val);
+}
+
+static inline void imsic_csr_clear(unsigned long reg, unsigned long val)
+{
+ csr_write(CSR_ISELECT, reg);
+ csr_clear(CSR_IREG, val);
+}
+
+struct imsic_priv *imsic;
+
+const struct imsic_global_config *imsic_get_global_config(void)
+{
+ return imsic ? &imsic->global : NULL;
+}
+EXPORT_SYMBOL_GPL(imsic_get_global_config);
+
+static bool __imsic_eix_read_clear(unsigned long id, bool pend)
+{
+ unsigned long isel, imask;
+
+ isel = id / BITS_PER_LONG;
+ isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
+ isel += pend ? IMSIC_EIP0 : IMSIC_EIE0;
+ imask = BIT(id & (__riscv_xlen - 1));
+
+ return (imsic_csr_read_clear(isel, imask) & imask) ? true : false;
+}
+
+static inline bool __imsic_id_read_clear_enabled(unsigned long id)
+{
+ return __imsic_eix_read_clear(id, false);
+}
+
+static inline bool __imsic_id_read_clear_pending(unsigned long id)
+{
+ return __imsic_eix_read_clear(id, true);
+}
+
+void __imsic_eix_update(unsigned long base_id, unsigned long num_id, bool pend, bool val)
+{
+ unsigned long id = base_id, last_id = base_id + num_id;
+ unsigned long i, isel, ireg;
+
+ while (id < last_id) {
+ isel = id / BITS_PER_LONG;
+ isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
+ isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
+
+ /*
+ * Prepare the ID mask to be programmed in the
+ * IMSIC EIEx and EIPx registers. These registers
+ * are XLEN-wide and we must not touch IDs which
+ * are < base_id or >= (base_id + num_id).
+ */
+ ireg = 0;
+ for (i = id & (__riscv_xlen - 1); (id < last_id) && (i < __riscv_xlen); i++) {
+ ireg |= BIT(i);
+ id++;
+ }
+
+ /*
+ * The IMSIC EIEx and EIPx registers are accessed
+ * indirectly through the ISELECT and IREG CSRs, so we
+ * need to access these CSRs without getting preempted.
+ *
+ * All existing users of this function call this
+ * function with local IRQs disabled so we don't
+ * need to do anything special here.
+ */
+ if (val)
+ imsic_csr_set(isel, ireg);
+ else
+ imsic_csr_clear(isel, ireg);
+ }
+}
+
+/* MUST be called with lpriv->lock held */
+static void __imsic_local_sync(struct imsic_local_priv *lpriv)
+{
+ struct imsic_local_config *mlocal;
+ struct imsic_vector *vec, *mvec;
+ int i;
+
+ /* This pairs with the barrier in __imsic_remote_sync(). */
+ smp_mb();
+
+ for_each_set_bit(i, lpriv->dirty_bitmap, imsic->global.nr_ids + 1) {
+ if (!i || i == IMSIC_IPI_ID)
+ goto skip;
+ vec = &lpriv->vectors[i];
+
+ if (vec->enable)
+ __imsic_id_set_enable(i);
+ else
+ __imsic_id_clear_enable(i);
+
+ /*
+ * If the ID was being moved to a new ID on some other CPU
+ * then we can get an MSI during the movement, so check the
+ * ID pending bit and re-trigger the new ID on the other CPU
+ * using an MMIO write.
+ */
+ mvec = vec->move;
+ vec->move = NULL;
+ if (mvec && mvec != vec) {
+ if (__imsic_id_read_clear_pending(i)) {
+ mlocal = per_cpu_ptr(imsic->global.local, mvec->cpu);
+ writel_relaxed(mvec->local_id, mlocal->msi_va);
+ }
+
+ imsic_vector_free(&lpriv->vectors[i]);
+ }
+
+skip:
+ bitmap_clear(lpriv->dirty_bitmap, i, 1);
+ }
+}
+
+void imsic_local_sync_all(void)
+{
+ struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&lpriv->lock, flags);
+ bitmap_fill(lpriv->dirty_bitmap, imsic->global.nr_ids + 1);
+ __imsic_local_sync(lpriv);
+ raw_spin_unlock_irqrestore(&lpriv->lock, flags);
+}
+
+void imsic_local_delivery(bool enable)
+{
+ if (enable) {
+ imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
+ imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
+ return;
+ }
+
+ imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
+ imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
+}
+
+#ifdef CONFIG_SMP
+static void imsic_local_timer_callback(struct timer_list *timer)
+{
+ struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&lpriv->lock, flags);
+ __imsic_local_sync(lpriv);
+ raw_spin_unlock_irqrestore(&lpriv->lock, flags);
+}
+
+/* MUST be called with lpriv->lock held */
+static void __imsic_remote_sync(struct imsic_local_priv *lpriv, unsigned int cpu)
+{
+ /*
+ * Ensure that changes to vector enable, vector move and
+ * dirty bitmap are visible to the target CPU.
+ *
+ * This pairs with the barrier in __imsic_local_sync().
+ */
+ smp_mb();
+
+ /*
+ * We schedule a timer on the target CPU if the target CPU is not
+ * the same as the current CPU. An offline CPU will unconditionally
+ * synchronize IDs through imsic_starting_cpu() when the
+ * CPU is brought up.
+ */
+ if (cpu_online(cpu)) {
+ if (cpu != smp_processor_id()) {
+ if (!timer_pending(&lpriv->timer)) {
+ lpriv->timer.expires = jiffies + 1;
+ add_timer_on(&lpriv->timer, cpu);
+ }
+ } else {
+ __imsic_local_sync(lpriv);
+ }
+ }
+}
+#else
+/* MUST be called with lpriv->lock held */
+static void __imsic_remote_sync(struct imsic_local_priv *lpriv, unsigned int cpu)
+{
+ __imsic_local_sync(lpriv);
+}
+#endif
+
+void imsic_vector_mask(struct imsic_vector *vec)
+{
+ struct imsic_local_priv *lpriv;
+
+ lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
+ if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
+ return;
+
+ /*
+ * This function is called through Linux irq subsystem with
+ * irqs disabled so no need to save/restore irq flags.
+ */
+
+ raw_spin_lock(&lpriv->lock);
+
+ vec->enable = false;
+ bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
+ __imsic_remote_sync(lpriv, vec->cpu);
+
+ raw_spin_unlock(&lpriv->lock);
+}
+
+void imsic_vector_unmask(struct imsic_vector *vec)
+{
+ struct imsic_local_priv *lpriv;
+
+ lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
+ if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
+ return;
+
+ /*
+ * This function is called through Linux irq subsystem with
+ * irqs disabled so no need to save/restore irq flags.
+ */
+
+ raw_spin_lock(&lpriv->lock);
+
+ vec->enable = true;
+ bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
+ __imsic_remote_sync(lpriv, vec->cpu);
+
+ raw_spin_unlock(&lpriv->lock);
+}
+
+
+bool imsic_vector_isenabled(struct imsic_vector *vec)
+{
+ struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
+ unsigned long flags;
+ bool ret;
+
+ raw_spin_lock_irqsave(&lpriv->lock, flags);
+ ret = vec->enable;
+ raw_spin_unlock_irqrestore(&lpriv->lock, flags);
+
+ return ret;
+}
+
+struct imsic_vector *imsic_vector_get_move(struct imsic_vector *vec)
+{
+ struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
+ struct imsic_vector *ret;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&lpriv->lock, flags);
+ ret = vec->move;
+ raw_spin_unlock_irqrestore(&lpriv->lock, flags);
+
+ return ret;
+}
+
+static bool imsic_vector_move_update(struct imsic_local_priv *lpriv, struct imsic_vector *vec,
+ bool new_enable, struct imsic_vector *new_move)
+{
+ unsigned long flags;
+ bool enabled;
+
+ raw_spin_lock_irqsave(&lpriv->lock, flags);
+
+ /* Update enable and move details */
+ enabled = vec->enable;
+ vec->enable = new_enable;
+ vec->move = new_move;
+
+ /* Mark the vector as dirty and synchronize */
+ bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
+ __imsic_remote_sync(lpriv, vec->cpu);
+
+ raw_spin_unlock_irqrestore(&lpriv->lock, flags);
+
+ return enabled;
+}
+
+void imsic_vector_move(struct imsic_vector *old_vec, struct imsic_vector *new_vec)
+{
+ struct imsic_local_priv *old_lpriv, *new_lpriv;
+ bool enabled;
+
+ if (WARN_ON(old_vec->cpu == new_vec->cpu))
+ return;
+
+ old_lpriv = per_cpu_ptr(imsic->lpriv, old_vec->cpu);
+ if (WARN_ON(&old_lpriv->vectors[old_vec->local_id] != old_vec))
+ return;
+
+ new_lpriv = per_cpu_ptr(imsic->lpriv, new_vec->cpu);
+ if (WARN_ON(&new_lpriv->vectors[new_vec->local_id] != new_vec))
+ return;
+
+ /*
+ * Move and re-trigger the new vector based on the pending
+ * state of the old vector because we might get a device
+ * interrupt on the old vector while the device was being moved
+ * to the new vector.
+ */
+ enabled = imsic_vector_move_update(old_lpriv, old_vec, false, new_vec);
+ imsic_vector_move_update(new_lpriv, new_vec, enabled, new_vec);
+}
+
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+void imsic_vector_debug_show(struct seq_file *m, struct imsic_vector *vec, int ind)
+{
+ struct imsic_local_priv *lpriv;
+ struct imsic_vector *mvec;
+ bool is_enabled;
+
+ lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
+ if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
+ return;
+
+ is_enabled = imsic_vector_isenabled(vec);
+ mvec = imsic_vector_get_move(vec);
+
+ seq_printf(m, "%*starget_cpu : %5u\n", ind, "", vec->cpu);
+ seq_printf(m, "%*starget_local_id : %5u\n", ind, "", vec->local_id);
+ seq_printf(m, "%*sis_reserved : %5u\n", ind, "",
+ (vec->local_id <= IMSIC_IPI_ID) ? 1 : 0);
+ seq_printf(m, "%*sis_enabled : %5u\n", ind, "", (is_enabled) ? 1 : 0);
+ seq_printf(m, "%*sis_move_pending : %5u\n", ind, "", (mvec) ? 1 : 0);
+ if (mvec) {
+ seq_printf(m, "%*smove_cpu : %5u\n", ind, "", mvec->cpu);
+ seq_printf(m, "%*smove_local_id : %5u\n", ind, "", mvec->local_id);
+ }
+}
+
+void imsic_vector_debug_show_summary(struct seq_file *m, int ind)
+{
+ irq_matrix_debug_show(m, imsic->matrix, ind);
+}
+#endif
+
+struct imsic_vector *imsic_vector_from_local_id(unsigned int cpu, unsigned int local_id)
+{
+ struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+
+ if (!lpriv || imsic->global.nr_ids < local_id)
+ return NULL;
+
+ return &lpriv->vectors[local_id];
+}
+
+struct imsic_vector *imsic_vector_alloc(unsigned int hwirq, const struct cpumask *mask)
+{
+ struct imsic_vector *vec = NULL;
+ struct imsic_local_priv *lpriv;
+ unsigned long flags;
+ unsigned int cpu;
+ int local_id;
+
+ raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
+ local_id = irq_matrix_alloc(imsic->matrix, mask, false, &cpu);
+ raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
+ if (local_id < 0)
+ return NULL;
+
+ lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+ vec = &lpriv->vectors[local_id];
+ vec->hwirq = hwirq;
+ vec->enable = false;
+ vec->move = NULL;
+
+ return vec;
+}
+
+void imsic_vector_free(struct imsic_vector *vec)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
+ vec->hwirq = UINT_MAX;
+ irq_matrix_free(imsic->matrix, vec->cpu, vec->local_id, false);
+ raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
+}
+
+static void __init imsic_local_cleanup(void)
+{
+ int cpu;
+ struct imsic_local_priv *lpriv;
+
+ for_each_possible_cpu(cpu) {
+ lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+
+ bitmap_free(lpriv->dirty_bitmap);
+ kfree(lpriv->vectors);
+ }
+
+ free_percpu(imsic->lpriv);
+}
+
+static int __init imsic_local_init(void)
+{
+ struct imsic_global_config *global = &imsic->global;
+ struct imsic_local_priv *lpriv;
+ struct imsic_vector *vec;
+ int cpu, i;
+
+ /* Allocate per-CPU private state */
+ imsic->lpriv = alloc_percpu(typeof(*(imsic->lpriv)));
+ if (!imsic->lpriv)
+ return -ENOMEM;
+
+ /* Setup per-CPU private state */
+ for_each_possible_cpu(cpu) {
+ lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+
+ raw_spin_lock_init(&lpriv->lock);
+
+ /* Allocate dirty bitmap */
+ lpriv->dirty_bitmap = bitmap_zalloc(global->nr_ids + 1, GFP_KERNEL);
+ if (!lpriv->dirty_bitmap)
+ goto fail_local_cleanup;
+
+#ifdef CONFIG_SMP
+ /* Setup lazy timer for synchronization */
+ timer_setup(&lpriv->timer, imsic_local_timer_callback, TIMER_PINNED);
+#endif
+
+ /* Allocate vector array */
+ lpriv->vectors = kcalloc(global->nr_ids + 1, sizeof(*lpriv->vectors),
+ GFP_KERNEL);
+ if (!lpriv->vectors)
+ goto fail_local_cleanup;
+
+ /* Setup vector array */
+ for (i = 0; i <= global->nr_ids; i++) {
+ vec = &lpriv->vectors[i];
+ vec->cpu = cpu;
+ vec->local_id = i;
+ vec->hwirq = UINT_MAX;
+ }
+ }
+
+ return 0;
+
+fail_local_cleanup:
+ imsic_local_cleanup();
+ return -ENOMEM;
+}
+
+void imsic_state_online(void)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
+ irq_matrix_online(imsic->matrix);
+ raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
+}
+
+void imsic_state_offline(void)
+{
+#ifdef CONFIG_SMP
+ struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
+#endif
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
+ irq_matrix_offline(imsic->matrix);
+ raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
+
+#ifdef CONFIG_SMP
+ raw_spin_lock_irqsave(&lpriv->lock, flags);
+ WARN_ON_ONCE(try_to_del_timer_sync(&lpriv->timer) < 0);
+ raw_spin_unlock_irqrestore(&lpriv->lock, flags);
+#endif
+}
+
+static int __init imsic_matrix_init(void)
+{
+ struct imsic_global_config *global = &imsic->global;
+
+ raw_spin_lock_init(&imsic->matrix_lock);
+ imsic->matrix = irq_alloc_matrix(global->nr_ids + 1,
+ 0, global->nr_ids + 1);
+ if (!imsic->matrix)
+ return -ENOMEM;
+
+ /* Reserve ID#0 because it is special and never implemented */
+ irq_matrix_assign_system(imsic->matrix, 0, false);
+
+ /* Reserve IPI ID because it is special and used internally */
+ irq_matrix_assign_system(imsic->matrix, IMSIC_IPI_ID, false);
+
+ return 0;
+}
+
+static int __init imsic_get_parent_hartid(struct fwnode_handle *fwnode,
+ u32 index, unsigned long *hartid)
+{
+ struct of_phandle_args parent;
+ int rc;
+
+ /*
+ * Currently, only OF fwnodes are supported; extend this
+ * function when adding ACPI support.
+ */
+ if (!is_of_node(fwnode))
+ return -EINVAL;
+
+ rc = of_irq_parse_one(to_of_node(fwnode), index, &parent);
+ if (rc)
+ return rc;
+
+ /*
+ * Skip interrupts other than external interrupts for
+ * current privilege level.
+ */
+ if (parent.args[0] != RV_IRQ_EXT)
+ return -EINVAL;
+
+ return riscv_of_parent_hartid(parent.np, hartid);
+}
+
+static int __init imsic_get_mmio_resource(struct fwnode_handle *fwnode,
+ u32 index, struct resource *res)
+{
+ /*
+ * Currently, only OF fwnodes are supported; extend this
+ * function when adding ACPI support.
+ */
+ if (!is_of_node(fwnode))
+ return -EINVAL;
+
+ return of_address_to_resource(to_of_node(fwnode), index, res);
+}
+
+static int __init imsic_parse_fwnode(struct fwnode_handle *fwnode,
+ struct imsic_global_config *global,
+ u32 *nr_parent_irqs,
+ u32 *nr_mmios)
+{
+ unsigned long hartid;
+ struct resource res;
+ int rc;
+ u32 i;
+
+ /*
+ * Currently, only OF fwnodes are supported; extend this
+ * function when adding ACPI support.
+ */
+ if (!is_of_node(fwnode))
+ return -EINVAL;
+
+ *nr_parent_irqs = 0;
+ *nr_mmios = 0;
+
+ /* Find number of parent interrupts */
+ while (!imsic_get_parent_hartid(fwnode, *nr_parent_irqs, &hartid))
+ (*nr_parent_irqs)++;
+ if (!(*nr_parent_irqs)) {
+ pr_err("%pfwP: no parent irqs available\n", fwnode);
+ return -EINVAL;
+ }
+
+ /* Find number of guest index bits in MSI address */
+ rc = of_property_read_u32(to_of_node(fwnode), "riscv,guest-index-bits",
+ &global->guest_index_bits);
+ if (rc)
+ global->guest_index_bits = 0;
+
+ /* Find number of HART index bits */
+ rc = of_property_read_u32(to_of_node(fwnode), "riscv,hart-index-bits",
+ &global->hart_index_bits);
+ if (rc) {
+ /* Assume default value */
+ global->hart_index_bits = __fls(*nr_parent_irqs);
+ if (BIT(global->hart_index_bits) < *nr_parent_irqs)
+ global->hart_index_bits++;
+ }
+
+ /* Find number of group index bits */
+ rc = of_property_read_u32(to_of_node(fwnode), "riscv,group-index-bits",
+ &global->group_index_bits);
+ if (rc)
+ global->group_index_bits = 0;
+
+ /*
+ * Find first bit position of group index.
+ * If not specified, assume the default APLIC-IMSIC configuration.
+ */
+ rc = of_property_read_u32(to_of_node(fwnode), "riscv,group-index-shift",
+ &global->group_index_shift);
+ if (rc)
+ global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
+
+ /* Find number of interrupt identities */
+ rc = of_property_read_u32(to_of_node(fwnode), "riscv,num-ids",
+ &global->nr_ids);
+ if (rc) {
+ pr_err("%pfwP: number of interrupt identities not found\n",
+ fwnode);
+ return rc;
+ }
+
+ /* Find number of guest interrupt identities */
+ rc = of_property_read_u32(to_of_node(fwnode), "riscv,num-guest-ids",
+ &global->nr_guest_ids);
+ if (rc)
+ global->nr_guest_ids = global->nr_ids;
+
+ /* Sanity check guest index bits */
+ i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
+ if (i < global->guest_index_bits) {
+ pr_err("%pfwP: guest index bits too big\n", fwnode);
+ return -EINVAL;
+ }
+
+ /* Sanity check HART index bits */
+ i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT - global->guest_index_bits;
+ if (i < global->hart_index_bits) {
+ pr_err("%pfwP: HART index bits too big\n", fwnode);
+ return -EINVAL;
+ }
+
+ /* Sanity check group index bits */
+ i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
+ global->guest_index_bits - global->hart_index_bits;
+ if (i < global->group_index_bits) {
+ pr_err("%pfwP: group index bits too big\n", fwnode);
+ return -EINVAL;
+ }
+
+ /* Sanity check group index shift */
+ i = global->group_index_bits + global->group_index_shift - 1;
+ if (i >= BITS_PER_LONG) {
+ pr_err("%pfwP: group index shift too big\n", fwnode);
+ return -EINVAL;
+ }
+
+ /* Sanity check number of interrupt identities */
+ if ((global->nr_ids < IMSIC_MIN_ID) ||
+ (global->nr_ids >= IMSIC_MAX_ID) ||
+ ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+ pr_err("%pfwP: invalid number of interrupt identities\n",
+ fwnode);
+ return -EINVAL;
+ }
+
+ /* Sanity check number of guest interrupt identities */
+ if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
+ (global->nr_guest_ids >= IMSIC_MAX_ID) ||
+ ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+ pr_err("%pfwP: invalid number of guest interrupt identities\n",
+ fwnode);
+ return -EINVAL;
+ }
+
+ /* Compute base address */
+ rc = imsic_get_mmio_resource(fwnode, 0, &res);
+ if (rc) {
+ pr_err("%pfwP: first MMIO resource not found\n", fwnode);
+ return -EINVAL;
+ }
+ global->base_addr = res.start;
+ global->base_addr &= ~(BIT(global->guest_index_bits +
+ global->hart_index_bits +
+ IMSIC_MMIO_PAGE_SHIFT) - 1);
+ global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+ global->group_index_shift);
+
+ /* Find number of MMIO register sets */
+ while (!imsic_get_mmio_resource(fwnode, *nr_mmios, &res))
+ (*nr_mmios)++;
+
+ return 0;
+}
+
+int __init imsic_setup_state(struct fwnode_handle *fwnode)
+{
+ u32 i, j, index, nr_parent_irqs, nr_mmios, nr_handlers = 0;
+ struct imsic_global_config *global;
+ struct imsic_local_config *local;
+ void __iomem **mmios_va = NULL;
+ struct resource *mmios = NULL;
+ unsigned long reloff, hartid;
+ phys_addr_t base_addr;
+ int rc, cpu;
+
+ /*
+ * Only one IMSIC instance allowed in a platform for clean
+ * implementation of SMP IRQ affinity and per-CPU IPIs.
+ *
+ * This means on a multi-socket (or multi-die) platform we
+ * will have multiple MMIO regions for one IMSIC instance.
+ */
+ if (imsic) {
+ pr_err("%pfwP: already initialized hence ignoring\n",
+ fwnode);
+ return -EALREADY;
+ }
+
+ if (!riscv_isa_extension_available(NULL, SxAIA)) {
+ pr_err("%pfwP: AIA support not available\n", fwnode);
+ return -ENODEV;
+ }
+
+ imsic = kzalloc(sizeof(*imsic), GFP_KERNEL);
+ if (!imsic)
+ return -ENOMEM;
+ imsic->fwnode = fwnode;
+ global = &imsic->global;
+
+ global->local = alloc_percpu(typeof(*(global->local)));
+ if (!global->local) {
+ rc = -ENOMEM;
+ goto out_free_priv;
+ }
+
+ /* Parse IMSIC fwnode */
+ rc = imsic_parse_fwnode(fwnode, global, &nr_parent_irqs, &nr_mmios);
+ if (rc)
+ goto out_free_local;
+
+ /* Allocate MMIO resource array */
+ mmios = kcalloc(nr_mmios, sizeof(*mmios), GFP_KERNEL);
+ if (!mmios) {
+ rc = -ENOMEM;
+ goto out_free_local;
+ }
+
+ /* Allocate MMIO virtual address array */
+ mmios_va = kcalloc(nr_mmios, sizeof(*mmios_va), GFP_KERNEL);
+ if (!mmios_va) {
+ rc = -ENOMEM;
+ goto out_iounmap;
+ }
+
+ /* Parse and map MMIO register sets */
+ for (i = 0; i < nr_mmios; i++) {
+ rc = imsic_get_mmio_resource(fwnode, i, &mmios[i]);
+ if (rc) {
+ pr_err("%pfwP: unable to parse MMIO regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+
+ base_addr = mmios[i].start;
+ base_addr &= ~(BIT(global->guest_index_bits +
+ global->hart_index_bits +
+ IMSIC_MMIO_PAGE_SHIFT) - 1);
+ base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+ global->group_index_shift);
+ if (base_addr != global->base_addr) {
+ rc = -EINVAL;
+ pr_err("%pfwP: address mismatch for regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+
+ mmios_va[i] = ioremap(mmios[i].start, resource_size(&mmios[i]));
+ if (!mmios_va[i]) {
+ rc = -EIO;
+ pr_err("%pfwP: unable to map MMIO regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+ }
+
+ /* Initialize local (or per-CPU) state */
+ rc = imsic_local_init();
+ if (rc) {
+ pr_err("%pfwP: failed to initialize local state\n",
+ fwnode);
+ goto out_iounmap;
+ }
+
+ /* Configure handlers for target CPUs */
+ for (i = 0; i < nr_parent_irqs; i++) {
+ rc = imsic_get_parent_hartid(fwnode, i, &hartid);
+ if (rc) {
+ pr_warn("%pfwP: hart ID for parent irq%d not found\n",
+ fwnode, i);
+ continue;
+ }
+
+ cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0) {
+ pr_warn("%pfwP: invalid cpuid for parent irq%d\n",
+ fwnode, i);
+ continue;
+ }
+
+ /* Find MMIO location of MSI page */
+ index = nr_mmios;
+ reloff = i * BIT(global->guest_index_bits) *
+ IMSIC_MMIO_PAGE_SZ;
+ for (j = 0; j < nr_mmios; j++) {
+ if (reloff < resource_size(&mmios[j])) {
+ index = j;
+ break;
+ }
+
+ /*
+ * MMIO region size may not be aligned to
+ * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
+ * if holes are present.
+ */
+ reloff -= ALIGN(resource_size(&mmios[j]),
+ BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ);
+ }
+ if (index >= nr_mmios) {
+ pr_warn("%pfwP: MMIO not found for parent irq%d\n",
+ fwnode, i);
+ continue;
+ }
+
+ local = per_cpu_ptr(global->local, cpu);
+ local->msi_pa = mmios[index].start + reloff;
+ local->msi_va = mmios_va[index] + reloff;
+
+ nr_handlers++;
+ }
+
+ /* If no CPU handlers found then can't take interrupts */
+ if (!nr_handlers) {
+ pr_err("%pfwP: No CPU handlers found\n", fwnode);
+ rc = -ENODEV;
+ goto out_local_cleanup;
+ }
+
+ /* Initialize matrix allocator */
+ rc = imsic_matrix_init();
+ if (rc) {
+ pr_err("%pfwP: failed to create matrix allocator\n",
+ fwnode);
+ goto out_local_cleanup;
+ }
+
+ /* We don't need MMIO arrays anymore so let's free-up */
+ kfree(mmios_va);
+ kfree(mmios);
+
+ return 0;
+
+out_local_cleanup:
+ imsic_local_cleanup();
+out_iounmap:
+ for (i = 0; i < nr_mmios; i++) {
+ if (mmios_va[i])
+ iounmap(mmios_va[i]);
+ }
+ kfree(mmios_va);
+ kfree(mmios);
+out_free_local:
+ free_percpu(imsic->global.local);
+out_free_priv:
+ kfree(imsic);
+ imsic = NULL;
+ return rc;
+}
diff --git a/drivers/irqchip/irq-riscv-imsic-state.h b/drivers/irqchip/irq-riscv-imsic-state.h
new file mode 100644
index 000000000000..f0c983db99eb
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic-state.h
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#ifndef _IRQ_RISCV_IMSIC_STATE_H
+#define _IRQ_RISCV_IMSIC_STATE_H
+
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/fwnode.h>
+#include <linux/timer.h>
+
+#define IMSIC_IPI_ID 1
+#define IMSIC_NR_IPI 8
+
+struct imsic_vector {
+ /* Fixed details of the vector */
+ unsigned int cpu;
+ unsigned int local_id;
+ /* Details saved by driver in the vector */
+ unsigned int hwirq;
+ /* Details accessed using local lock held */
+ bool enable;
+ struct imsic_vector *move;
+};
+
+struct imsic_local_priv {
+ /* Local lock to protect vector enable/move variables and dirty bitmap */
+ raw_spinlock_t lock;
+
+ /* Local dirty bitmap for synchronization */
+ unsigned long *dirty_bitmap;
+
+#ifdef CONFIG_SMP
+ /* Local timer for synchronization */
+ struct timer_list timer;
+#endif
+
+ /* Local vector table */
+ struct imsic_vector *vectors;
+};
+
+struct imsic_priv {
+ /* Device details */
+ struct fwnode_handle *fwnode;
+
+ /* Global configuration common for all HARTs */
+ struct imsic_global_config global;
+
+ /* Per-CPU state */
+ struct imsic_local_priv __percpu *lpriv;
+
+ /* State of IRQ matrix allocator */
+ raw_spinlock_t matrix_lock;
+ struct irq_matrix *matrix;
+
+ /* IRQ domains (created by platform driver) */
+ struct irq_domain *base_domain;
+};
+
+extern struct imsic_priv *imsic;
+
+void __imsic_eix_update(unsigned long base_id, unsigned long num_id, bool pend, bool val);
+
+static inline void __imsic_id_set_enable(unsigned long id)
+{
+ __imsic_eix_update(id, 1, false, true);
+}
+
+static inline void __imsic_id_clear_enable(unsigned long id)
+{
+ __imsic_eix_update(id, 1, false, false);
+}
+
+void imsic_local_sync_all(void);
+void imsic_local_delivery(bool enable);
+
+void imsic_vector_mask(struct imsic_vector *vec);
+void imsic_vector_unmask(struct imsic_vector *vec);
+bool imsic_vector_isenabled(struct imsic_vector *vec);
+struct imsic_vector *imsic_vector_get_move(struct imsic_vector *vec);
+void imsic_vector_move(struct imsic_vector *old_vec, struct imsic_vector *new_vec);
+
+struct imsic_vector *imsic_vector_from_local_id(unsigned int cpu, unsigned int local_id);
+
+struct imsic_vector *imsic_vector_alloc(unsigned int hwirq, const struct cpumask *mask);
+void imsic_vector_free(struct imsic_vector *vector);
+
+void imsic_vector_debug_show(struct seq_file *m, struct imsic_vector *vec, int ind);
+void imsic_vector_debug_show_summary(struct seq_file *m, int ind);
+
+void imsic_state_online(void);
+void imsic_state_offline(void);
+int imsic_setup_state(struct fwnode_handle *fwnode);
+
+#endif
diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
new file mode 100644
index 000000000000..b997eb277b5b
--- /dev/null
+++ b/include/linux/irqchip/riscv-imsic.h
@@ -0,0 +1,87 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_IMSIC_H
+#define __LINUX_IRQCHIP_RISCV_IMSIC_H
+
+#include <linux/types.h>
+#include <linux/bitops.h>
+#include <asm/csr.h>
+
+#define IMSIC_MMIO_PAGE_SHIFT 12
+#define IMSIC_MMIO_PAGE_SZ BIT(IMSIC_MMIO_PAGE_SHIFT)
+#define IMSIC_MMIO_PAGE_LE 0x00
+#define IMSIC_MMIO_PAGE_BE 0x04
+
+#define IMSIC_MIN_ID 63
+#define IMSIC_MAX_ID 2048
+
+#define IMSIC_EIDELIVERY 0x70
+
+#define IMSIC_EITHRESHOLD 0x72
+
+#define IMSIC_EIP0 0x80
+#define IMSIC_EIP63 0xbf
+#define IMSIC_EIPx_BITS 32
+
+#define IMSIC_EIE0 0xc0
+#define IMSIC_EIE63 0xff
+#define IMSIC_EIEx_BITS 32
+
+#define IMSIC_FIRST IMSIC_EIDELIVERY
+#define IMSIC_LAST IMSIC_EIE63
+
+#define IMSIC_MMIO_SETIPNUM_LE 0x00
+#define IMSIC_MMIO_SETIPNUM_BE 0x04
+
+struct imsic_local_config {
+ phys_addr_t msi_pa;
+ void __iomem *msi_va;
+};
+
+struct imsic_global_config {
+ /*
+ * MSI Target Address Scheme
+ *
+ * XLEN-1                                                12    0
+ * |                                                     |     |
+ * -------------------------------------------------------------
+ * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
+ * -------------------------------------------------------------
+ */
+
+ /* Bits representing Guest index, HART index, and Group index */
+ u32 guest_index_bits;
+ u32 hart_index_bits;
+ u32 group_index_bits;
+ u32 group_index_shift;
+
+ /* Global base address matching all target MSI addresses */
+ phys_addr_t base_addr;
+
+ /* Number of interrupt identities */
+ u32 nr_ids;
+
+ /* Number of guest interrupt identities */
+ u32 nr_guest_ids;
+
+ /* Per-CPU IMSIC addresses */
+ struct imsic_local_config __percpu *local;
+};
+
+#ifdef CONFIG_RISCV_IMSIC
+
+extern const struct imsic_global_config *imsic_get_global_config(void);
+
+#else
+
+static inline const struct imsic_global_config *imsic_get_global_config(void)
+{
+ return NULL;
+}
+
+#endif
+
+#endif
--
2.34.1
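
As a rough illustration of the register-set address check in
imsic_setup_state() above: each MMIO start address has its guest, HART, and
group index fields masked out, and the remainder must equal the global base
address. The standalone C below is only a sketch of that masking under the
address scheme documented in riscv-imsic.h. struct imsic_layout and both
function names are hypothetical stand-ins (not driver symbols), and 64-bit
physical addresses are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IMSIC_MMIO_PAGE_SHIFT 12

/* Hypothetical stand-in for the index fields of struct imsic_global_config. */
struct imsic_layout {
	uint32_t guest_index_bits;
	uint32_t hart_index_bits;
	uint32_t group_index_bits;
	uint32_t group_index_shift;
};

/* Clear the page offset plus the guest, HART, and group index fields. */
static uint64_t imsic_base_of(const struct imsic_layout *l, uint64_t addr)
{
	uint32_t low_bits = l->guest_index_bits + l->hart_index_bits +
			    IMSIC_MMIO_PAGE_SHIFT;

	/* Keep every shift below the width of the type. */
	assert(low_bits < 64);
	assert(l->group_index_bits + l->group_index_shift < 64);

	addr &= ~((UINT64_C(1) << low_bits) - 1);
	addr &= ~(((UINT64_C(1) << l->group_index_bits) - 1) <<
		  l->group_index_shift);
	return addr;
}

/* True when a register set start address reduces to the given base. */
static bool imsic_regset_matches(const struct imsic_layout *l,
				 uint64_t regset_start, uint64_t base_addr)
{
	return imsic_base_of(l, regset_start) == base_addr;
}
```

With guest_index_bits = 1, hart_index_bits = 2, group_index_bits = 1 and
group_index_shift = 24, a register set at 0x29000000 reduces to the base
0x28000000, while one at 0x28008000 does not, because bit 15 lies outside
all three index fields.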

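The MSI page lookup in imsic_setup_state() above walks the MMIO regions in
order and subtracts each region's aligned size until the per-HART offset
falls inside a region. A minimal sketch of that walk follows, assuming the
offset is always a multiple of the per-HART stride (as it is when computed
as i * BIT(guest_index_bits) * IMSIC_MMIO_PAGE_SZ). The names align_up() and
imsic_find_region() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Find the region that holds byte offset *reloff of the concatenated
 * per-HART pages. On success, return the region index and leave the
 * offset relative to that region in *reloff. Return nr_regions when the
 * offset lies beyond the last region.
 */
static size_t imsic_find_region(const uint64_t *region_size, size_t nr_regions,
				uint64_t hart_stride, uint64_t *reloff)
{
	size_t j;

	/* A stride-aligned offset keeps the subtraction below from wrapping. */
	assert(*reloff % hart_stride == 0);

	for (j = 0; j < nr_regions; j++) {
		if (*reloff < region_size[j])
			return j;
		/* A region with holes may not be a multiple of the stride. */
		*reloff -= align_up(region_size[j], hart_stride);
	}
	return nr_regions;
}
```

For two 8 KiB regions and a 4 KiB stride, the third HART (offset 0x2000)
lands at offset 0 of region 1, and a fifth HART (offset 0x4000) finds no
region, which corresponds to the "MMIO not found" warning path.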

2024-02-20 06:09:44

by Anup Patel

Subject: [PATCH v13 08/13] irqchip/riscv-imsic: Add device MSI domain support for PCI devices

The Linux PCI framework supports per-device MSI domains for PCI devices
so let us extend the IMSIC driver to allow per-device MSI domains for
PCI devices.

Signed-off-by: Anup Patel <[email protected]>
---
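A small model of how the bus_select_mask change in the patch below could
take effect: MATCH_PCI_MSI contributes a bit only when PCI support is built
in, so PCI MSI lookups match the IMSIC parent domain only in that case. The
enum values, HAVE_PCI_MSI, struct parent_ops, and parent_selects() are
hypothetical, and the matching rule (own token, or bit set in the mask) is
an assumption about the domain select callback, which is not part of this
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bus tokens; the kernel's enum has many more entries. */
enum bus_token {
	BUS_ANY = 0,
	BUS_PCI_MSI,
	BUS_PLATFORM_MSI,
	BUS_NEXUS,
	BUS_NR_TOKENS,
};

#define BIT(n) (1UL << (n))

/* Define as 0 to model a build without the PCI option. */
#ifndef HAVE_PCI_MSI
#define HAVE_PCI_MSI 1
#endif

#if HAVE_PCI_MSI
#define MATCH_PCI_MSI BIT(BUS_PCI_MSI)
#else
#define MATCH_PCI_MSI 0
#endif
#define MATCH_PLATFORM_MSI BIT(BUS_PLATFORM_MSI)

struct parent_ops {
	enum bus_token select_token;
	unsigned long select_mask;
};

static const struct parent_ops imsic_like_ops = {
	.select_token = BUS_NEXUS,
	.select_mask = MATCH_PCI_MSI | MATCH_PLATFORM_MSI,
};

/* Accept the parent's own token, or any token whose bit is in the mask. */
static bool parent_selects(const struct parent_ops *ops, enum bus_token token)
{
	assert(token < BUS_NR_TOKENS);
	if (token == ops->select_token)
		return true;
	return (ops->select_mask & BIT(token)) != 0;
}
```

Building the sketch with -DHAVE_PCI_MSI=0 makes the BUS_PCI_MSI lookup fail
while the platform MSI lookup still succeeds, mirroring the "#define
MATCH_PCI_MSI 0" fallback.
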
drivers/irqchip/Kconfig | 7 +++++
drivers/irqchip/irq-riscv-imsic-platform.c | 36 ++++++++++++++++++++--
2 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 85f86e31c996..2fc0cb32341a 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -553,6 +553,13 @@ config RISCV_IMSIC
select GENERIC_IRQ_MATRIX_ALLOCATOR
select GENERIC_MSI_IRQ

+config RISCV_IMSIC_PCI
+ bool
+ depends on RISCV_IMSIC
+ depends on PCI
+ depends on PCI_MSI
+ default RISCV_IMSIC
+
config EXYNOS_IRQ_COMBINER
bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c
index 7ee44c493dbc..37f47375d5b7 100644
--- a/drivers/irqchip/irq-riscv-imsic-platform.c
+++ b/drivers/irqchip/irq-riscv-imsic-platform.c
@@ -14,6 +14,7 @@
#include <linux/irqdomain.h>
#include <linux/module.h>
#include <linux/msi.h>
+#include <linux/pci.h>
#include <linux/platform_device.h>
#include <linux/spinlock.h>
#include <linux/smp.h>
@@ -209,6 +210,28 @@ static const struct irq_domain_ops imsic_base_domain_ops = {
#endif
};

+#ifdef CONFIG_RISCV_IMSIC_PCI
+
+static void imsic_pci_mask_irq(struct irq_data *d)
+{
+ pci_msi_mask_irq(d);
+ irq_chip_mask_parent(d);
+}
+
+static void imsic_pci_unmask_irq(struct irq_data *d)
+{
+ irq_chip_unmask_parent(d);
+ pci_msi_unmask_irq(d);
+}
+
+#define MATCH_PCI_MSI BIT(DOMAIN_BUS_PCI_MSI)
+
+#else
+
+#define MATCH_PCI_MSI 0
+
+#endif
+
static bool imsic_init_dev_msi_info(struct device *dev,
struct irq_domain *domain,
struct irq_domain *real_parent,
@@ -218,6 +241,7 @@ static bool imsic_init_dev_msi_info(struct device *dev,

/* MSI parent domain specific settings */
switch (real_parent->bus_token) {
+ case DOMAIN_BUS_PCI_MSI:
case DOMAIN_BUS_NEXUS:
if (WARN_ON_ONCE(domain != real_parent))
return false;
@@ -232,6 +256,13 @@ static bool imsic_init_dev_msi_info(struct device *dev,

/* Is the target supported? */
switch (info->bus_token) {
+#ifdef CONFIG_RISCV_IMSIC_PCI
+ case DOMAIN_BUS_PCI_DEVICE_MSI:
+ case DOMAIN_BUS_PCI_DEVICE_MSIX:
+ info->chip->irq_mask = imsic_pci_mask_irq;
+ info->chip->irq_unmask = imsic_pci_unmask_irq;
+ break;
+#endif
case DOMAIN_BUS_DEVICE_MSI:
/*
* Per-device MSI should never have any MSI feature bits
@@ -271,11 +302,12 @@ static bool imsic_init_dev_msi_info(struct device *dev,
#define MATCH_PLATFORM_MSI BIT(DOMAIN_BUS_PLATFORM_MSI)

static const struct msi_parent_ops imsic_msi_parent_ops = {
- .supported_flags = MSI_GENERIC_FLAGS_MASK,
+ .supported_flags = MSI_GENERIC_FLAGS_MASK |
+ MSI_FLAG_PCI_MSIX,
.required_flags = MSI_FLAG_USE_DEF_DOM_OPS |
MSI_FLAG_USE_DEF_CHIP_OPS,
.bus_select_token = DOMAIN_BUS_NEXUS,
- .bus_select_mask = MATCH_PLATFORM_MSI,
+ .bus_select_mask = MATCH_PCI_MSI | MATCH_PLATFORM_MSI,
.init_dev_msi_info = imsic_init_dev_msi_info,
};

--
2.34.1


2024-02-20 06:10:02

by Anup Patel

Subject: [PATCH v13 09/13] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC

Add a DT bindings document for the RISC-V advanced platform-level interrupt
controller (APLIC) defined by the RISC-V advanced interrupt architecture
(AIA) specification.

Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
---
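To make the riscv,delegation triple in the binding below concrete, here is
a hedged sketch of how firmware might sanity-check one entry before
programming the delegation registers. It assumes sources are numbered from
1 up to riscv,num-sources (as the binding's examples suggest) and that the
child phandle must appear in riscv,children. The struct and function names
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One riscv,delegation entry: <child-phandle first last>. */
struct aplic_delegation {
	uint32_t child;	/* phandle of the child APLIC domain */
	uint32_t first;	/* first delegated source, inclusive */
	uint32_t last;	/* last delegated source, inclusive */
};

/* Return the child index of a phandle, or -1 if it is not listed. */
static int aplic_child_index(const uint32_t *children, size_t nr_children,
			     uint32_t phandle)
{
	for (size_t i = 0; i < nr_children; i++)
		if (children[i] == phandle)
			return (int)i;
	return -1;
}

/* Check one entry against the children list and the source count. */
static bool aplic_delegation_valid(const struct aplic_delegation *d,
				   const uint32_t *children, size_t nr_children,
				   uint32_t num_sources)
{
	/* The binding limits riscv,num-sources to 1..1023. */
	assert(num_sources >= 1 && num_sources <= 1023);

	if (aplic_child_index(children, nr_children, d->child) < 0)
		return false;
	return d->first >= 1 && d->first <= d->last && d->last <= num_sources;
}
```

The child index returned by aplic_child_index() follows the binding's rule
that the first entry of riscv,children has child index 0.
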
.../interrupt-controller/riscv,aplic.yaml | 172 ++++++++++++++++++
1 file changed, 172 insertions(+)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml

diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
new file mode 100644
index 000000000000..190a6499c932
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
@@ -0,0 +1,172 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Advanced Platform Level Interrupt Controller (APLIC)
+
+maintainers:
+ - Anup Patel <[email protected]>
+
+description:
+ The RISC-V advanced interrupt architecture (AIA) defines an advanced
+ platform level interrupt controller (APLIC) for handling wired interrupts
+ in a RISC-V platform. The RISC-V AIA specification can be found at
+ https://github.com/riscv/riscv-aia.
+
+ The RISC-V APLIC is implemented as hierarchical APLIC domains where all
+ interrupt sources connect to the root APLIC domain and a parent APLIC
+ domain can delegate interrupt sources to its child APLIC domains. There
+ is one device tree node for each APLIC domain.
+
+allOf:
+ - $ref: /schemas/interrupt-controller.yaml#
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - qemu,aplic
+ - const: riscv,aplic
+
+ reg:
+ maxItems: 1
+
+ interrupt-controller: true
+
+ "#interrupt-cells":
+ const: 2
+
+ interrupts-extended:
+ minItems: 1
+ maxItems: 16384
+ description:
+ Given APLIC domain directly injects external interrupts to a set of
+ RISC-V HARTs (or CPUs). Each node pointed to should be a riscv,cpu-intc
+ node, which has a CPU node (i.e. RISC-V HART) as parent.
+
+ msi-parent:
+ description:
+ Given APLIC domain forwards wired interrupts as MSIs to an AIA incoming
+ message signaled interrupt controller (IMSIC). If both "msi-parent" and
+ "interrupts-extended" properties are present then it means the APLIC
+ domain supports both MSI mode and Direct mode in HW. In this case, the
+ APLIC driver has to choose between MSI mode or Direct mode.
+
+ riscv,num-sources:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ maximum: 1023
+ description:
+ Specifies the number of wired interrupt sources supported by this
+ APLIC domain.
+
+ riscv,children:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ minItems: 1
+ maxItems: 1024
+ items:
+ maxItems: 1
+ description:
+ A list of child APLIC domains for the given APLIC domain. Each child
+ APLIC domain is assigned a child index in increasing order, with the
+ first child APLIC domain assigned child index 0. The APLIC domain child
+ index is used by firmware to delegate interrupts from the given APLIC
+ domain to a particular child APLIC domain.
+
+ riscv,delegation:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ minItems: 1
+ maxItems: 1024
+ items:
+ items:
+ - description: child APLIC domain phandle
+ - description: first interrupt number of the parent APLIC domain (inclusive)
+ - description: last interrupt number of the parent APLIC domain (inclusive)
+ description:
+ An interrupt delegation list where each entry is a triple consisting
+ of child APLIC domain phandle, first interrupt number of the parent
+ APLIC domain, and last interrupt number of the parent APLIC domain.
+ Firmware must configure interrupt delegation registers based on
+ interrupt delegation list.
+
+dependencies:
+ riscv,delegation: [ "riscv,children" ]
+
+required:
+ - compatible
+ - reg
+ - interrupt-controller
+ - "#interrupt-cells"
+ - riscv,num-sources
+
+anyOf:
+ - required:
+ - interrupts-extended
+ - required:
+ - msi-parent
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ // Example 1 (APLIC domains directly injecting interrupt to HARTs):
+
+ interrupt-controller@c000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ interrupts-extended = <&cpu1_intc 11>,
+ <&cpu2_intc 11>,
+ <&cpu3_intc 11>,
+ <&cpu4_intc 11>;
+ reg = <0xc000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ riscv,children = <&aplic1>, <&aplic2>;
+ riscv,delegation = <&aplic1 1 63>;
+ };
+
+ aplic1: interrupt-controller@d000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ interrupts-extended = <&cpu1_intc 9>,
+ <&cpu2_intc 9>;
+ reg = <0xd000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+
+ aplic2: interrupt-controller@e000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ interrupts-extended = <&cpu3_intc 9>,
+ <&cpu4_intc 9>;
+ reg = <0xe000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+
+ - |
+ // Example 2 (APLIC domains forwarding interrupts as MSIs):
+
+ interrupt-controller@c000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ msi-parent = <&imsic_mlevel>;
+ reg = <0xc000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ riscv,children = <&aplic3>;
+ riscv,delegation = <&aplic3 1 63>;
+ };
+
+ aplic3: interrupt-controller@d000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ msi-parent = <&imsic_slevel>;
+ reg = <0xd000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+...
--
2.34.1


2024-02-20 06:10:25

by Anup Patel

Subject: [PATCH v13 11/13] irqchip/riscv-aplic: Add support for MSI-mode

The RISC-V advanced platform-level interrupt controller (APLIC) has
two modes of operation: 1) Direct mode and 2) MSI mode.
(For more details, refer to https://github.com/riscv/riscv-aia)

In APLIC MSI-mode, wired interrupts are forwarded as message signaled
interrupts (MSIs) to CPUs via IMSIC.

Extend the existing APLIC irqchip driver to support MSI-mode for
RISC-V platforms having both wired interrupts and MSIs.

Signed-off-by: Anup Patel <[email protected]>
---
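For readers following aplic_msi_write_msg() and the HHXS computation in the
patch below, this standalone sketch shows how an MSI target address can be
split into group, HART, and guest indexes and packed into a TARGET register
value. The field positions used here (HART index in bits 31:18, guest index
in bits 17:12, EIID in bits 10:0) follow the AIA specification's MSI
delivery mode layout. The struct, the function names, and the expansion of
the APLIC_xMSICFGADDR_PPN_* helpers are assumptions, since riscv-aplic.h is
not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

#define PPN_SHIFT 12	/* MSI target pages are 4 KiB */

/* Hypothetical copy of the widths kept in struct aplic_msicfg. */
struct msicfg_sketch {
	uint32_t lhxs;	/* guest index width */
	uint32_t lhxw;	/* HART index width */
	uint32_t hhxw;	/* group index width */
	uint32_t hhxs;	/* group index position, counted from address bit 24 */
};

/* Convert an IMSIC group index shift (an address bit) to HHXS, or -1. */
static int msicfg_hhxs_from_shift(uint32_t group_index_shift)
{
	if (group_index_shift < 2 * PPN_SHIFT)
		return -1;
	return (int)(group_index_shift - 2 * PPN_SHIFT);
}

/* Build the TARGET register value for one MSI address and identity. */
static uint32_t aplic_target_value(const struct msicfg_sketch *mc,
				   uint64_t msi_addr, uint32_t eiid)
{
	uint64_t tppn = msi_addr >> PPN_SHIFT;
	uint32_t guest, hart, group;

	assert(mc->lhxs < 32 && mc->lhxw < 32 && mc->hhxw < 32);

	guest = (uint32_t)(tppn & ((UINT32_C(1) << mc->lhxs) - 1));
	hart = (uint32_t)((tppn >> mc->lhxs) &
			  ((UINT32_C(1) << mc->lhxw) - 1));
	group = (uint32_t)((tppn >> (mc->hhxs + PPN_SHIFT)) &
			   ((UINT32_C(1) << mc->hhxw) - 1));

	/* The group index forms the upper bits of the combined HART index. */
	hart |= group << mc->lhxw;

	assert(hart <= 0x3fff && guest <= 0x3f && eiid <= 0x7ff);
	return (hart << 18) | (guest << 12) | eiid;
}
```

With lhxs = 0, lhxw = 2, hhxw = 1 and a group index shift of 24 (so hhxs =
0), the address 0x29003000 decodes to group 1 and HART 3 within the group,
giving a combined HART index of 7. With EIID 25 the TARGET value is
0x001c0019.
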
drivers/irqchip/Kconfig | 6 +
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-riscv-aplic-main.c | 2 +-
drivers/irqchip/irq-riscv-aplic-main.h | 8 +
drivers/irqchip/irq-riscv-aplic-msi.c | 263 +++++++++++++++++++++++++
5 files changed, 279 insertions(+), 1 deletion(-)
create mode 100644 drivers/irqchip/irq-riscv-aplic-msi.c

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index dbc8811d3764..806b5fccb3e8 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -551,6 +551,12 @@ config RISCV_APLIC
depends on RISCV
select IRQ_DOMAIN_HIERARCHY

+config RISCV_APLIC_MSI
+ bool
+ depends on RISCV_APLIC
+ select GENERIC_MSI_IRQ
+ default RISCV_APLIC
+
config RISCV_IMSIC
bool
depends on RISCV
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 7f8289790ed8..47995fdb2c60 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic-main.o irq-riscv-aplic-direct.o
+obj-$(CONFIG_RISCV_APLIC_MSI) += irq-riscv-aplic-msi.o
obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o irq-riscv-imsic-platform.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
diff --git a/drivers/irqchip/irq-riscv-aplic-main.c b/drivers/irqchip/irq-riscv-aplic-main.c
index e6617147daff..302f4ed7c075 100644
--- a/drivers/irqchip/irq-riscv-aplic-main.c
+++ b/drivers/irqchip/irq-riscv-aplic-main.c
@@ -187,7 +187,7 @@ static int aplic_probe(struct platform_device *pdev)
if (is_of_node(dev->fwnode))
msi_mode = of_property_present(to_of_node(dev->fwnode), "msi-parent");
if (msi_mode)
- rc = -ENODEV;
+ rc = aplic_msi_setup(dev, regs);
else
rc = aplic_direct_setup(dev, regs);
if (rc)
diff --git a/drivers/irqchip/irq-riscv-aplic-main.h b/drivers/irqchip/irq-riscv-aplic-main.h
index 4cfbadf37ddc..4393927d8c80 100644
--- a/drivers/irqchip/irq-riscv-aplic-main.h
+++ b/drivers/irqchip/irq-riscv-aplic-main.h
@@ -40,5 +40,13 @@ int aplic_irqdomain_translate(struct irq_fwspec *fwspec, u32 gsi_base,
void aplic_init_hw_global(struct aplic_priv *priv, bool msi_mode);
int aplic_setup_priv(struct aplic_priv *priv, struct device *dev, void __iomem *regs);
int aplic_direct_setup(struct device *dev, void __iomem *regs);
+#ifdef CONFIG_RISCV_APLIC_MSI
+int aplic_msi_setup(struct device *dev, void __iomem *regs);
+#else
+static inline int aplic_msi_setup(struct device *dev, void __iomem *regs)
+{
+ return -ENODEV;
+}
+#endif

#endif
diff --git a/drivers/irqchip/irq-riscv-aplic-msi.c b/drivers/irqchip/irq-riscv-aplic-msi.c
new file mode 100644
index 000000000000..b2a25e011bb2
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic-msi.c
@@ -0,0 +1,263 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/riscv-aplic.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+#include <linux/printk.h>
+#include <linux/smp.h>
+
+#include "irq-riscv-aplic-main.h"
+
+static void aplic_msi_irq_unmask(struct irq_data *d)
+{
+ aplic_irq_unmask(d);
+ irq_chip_unmask_parent(d);
+}
+
+static void aplic_msi_irq_mask(struct irq_data *d)
+{
+ irq_chip_mask_parent(d);
+ aplic_irq_mask(d);
+}
+
+static void aplic_msi_irq_eoi(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ u32 reg_off, reg_mask;
+
+ /*
+ * EOI handling is required only for level-triggered interrupts
+ * when APLIC is in MSI mode.
+ */
+
+ reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
+ reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
+ switch (irqd_get_trigger_type(d)) {
+ case IRQ_TYPE_LEVEL_LOW:
+ /*
+ * If the rectified input value of the source is still low
+ * then set the interrupt pending bit so that interrupt is
+ * re-triggered via MSI.
+ */
+ if (!(readl(priv->regs + reg_off) & reg_mask))
+ writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
+ break;
+ case IRQ_TYPE_LEVEL_HIGH:
+ /*
+ * If the rectified input value of the source is still high
+ * then set the interrupt pending bit so that interrupt is
+ * re-triggered via MSI.
+ */
+ if (readl(priv->regs + reg_off) & reg_mask)
+ writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
+ break;
+ }
+}
+
+static void aplic_msi_write_msg(struct irq_data *d, struct msi_msg *msg)
+{
+ unsigned int group_index, hart_index, guest_index, val;
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ struct aplic_msicfg *mc = &priv->msicfg;
+ phys_addr_t tppn, tbppn, msg_addr;
+ void __iomem *target;
+
+ /* For zeroed MSI, simply write zero into the target register */
+ if (!msg->address_hi && !msg->address_lo && !msg->data) {
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ writel(0, target);
+ return;
+ }
+
+ /* Sanity check on message data */
+ WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
+
+ /* Compute target MSI address */
+ msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
+ tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+
+ /* Compute target HART Base PPN */
+ tbppn = tppn;
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+ WARN_ON(tbppn != mc->base_ppn);
+
+ /* Compute target group and hart indexes */
+ group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
+ APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
+ hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
+ APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
+ hart_index |= (group_index << mc->lhxw);
+ WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
+
+ /* Compute target guest index */
+ guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
+
+ /* Update IRQ TARGET register */
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ val = FIELD_PREP(APLIC_TARGET_HART_IDX, hart_index);
+ val |= FIELD_PREP(APLIC_TARGET_GUEST_IDX, guest_index);
+ val |= FIELD_PREP(APLIC_TARGET_EIID, msg->data);
+ writel(val, target);
+}
+
+static void aplic_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
+{
+ arg->desc = desc;
+ arg->hwirq = (u32)desc->data.icookie.value;
+}
+
+static int aplic_msi_translate(struct irq_domain *d, struct irq_fwspec *fwspec,
+ unsigned long *hwirq, unsigned int *type)
+{
+ struct msi_domain_info *info = d->host_data;
+ struct aplic_priv *priv = info->data;
+
+ return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
+}
+
+static const struct msi_domain_template aplic_msi_template = {
+ .chip = {
+ .name = "APLIC-MSI",
+ .irq_mask = aplic_msi_irq_mask,
+ .irq_unmask = aplic_msi_irq_unmask,
+ .irq_set_type = aplic_irq_set_type,
+ .irq_eoi = aplic_msi_irq_eoi,
+#ifdef CONFIG_SMP
+ .irq_set_affinity = irq_chip_set_affinity_parent,
+#endif
+ .irq_write_msi_msg = aplic_msi_write_msg,
+ .flags = IRQCHIP_SET_TYPE_MASKED |
+ IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+ },
+
+ .ops = {
+ .set_desc = aplic_msi_set_desc,
+ .msi_translate = aplic_msi_translate,
+ },
+
+ .info = {
+ .bus_token = DOMAIN_BUS_WIRED_TO_MSI,
+ .flags = MSI_FLAG_USE_DEV_FWNODE,
+ .handler = handle_fasteoi_irq,
+ },
+};
+
+int aplic_msi_setup(struct device *dev, void __iomem *regs)
+{
+ const struct imsic_global_config *imsic_global;
+ struct aplic_priv *priv;
+ struct aplic_msicfg *mc;
+ phys_addr_t pa;
+ int rc;
+
+ priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ rc = aplic_setup_priv(priv, dev, regs);
+ if (rc) {
+ dev_err(dev, "failed to create APLIC context\n");
+ return rc;
+ }
+ mc = &priv->msicfg;
+
+ /*
+ * The APLIC outgoing MSI config registers assume target MSI
+ * controller to be RISC-V AIA IMSIC controller.
+ */
+ imsic_global = imsic_get_global_config();
+ if (!imsic_global) {
+ dev_err(dev, "IMSIC global config not found\n");
+ return -ENODEV;
+ }
+
+ /* Find number of guest index bits (LHXS) */
+ mc->lhxs = imsic_global->guest_index_bits;
+ if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
+ dev_err(dev, "IMSIC guest index bits big for APLIC LHXS\n");
+ return -EINVAL;
+ }
+
+ /* Find number of HART index bits (LHXW) */
+ mc->lhxw = imsic_global->hart_index_bits;
+ if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
+ dev_err(dev, "IMSIC hart index bits big for APLIC LHXW\n");
+ return -EINVAL;
+ }
+
+ /* Find number of group index bits (HHXW) */
+ mc->hhxw = imsic_global->group_index_bits;
+ if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
+ dev_err(dev, "IMSIC group index bits big for APLIC HHXW\n");
+ return -EINVAL;
+ }
+
+ /* Find first bit position of group index (HHXS) */
+ mc->hhxs = imsic_global->group_index_shift;
+ if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
+ dev_err(dev, "IMSIC group index shift should be >= %d\n",
+ (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
+ return -EINVAL;
+ }
+ mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
+ if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
+ dev_err(dev, "IMSIC group index shift big for APLIC HHXS\n");
+ return -EINVAL;
+ }
+
+ /* Compute PPN base */
+ mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+
+ /* Setup global config and interrupt delivery */
+ aplic_init_hw_global(priv, true);
+
+ /* Set the APLIC device MSI domain if not available */
+ if (!dev_get_msi_domain(dev)) {
+ /*
+ * The device MSI domain for OF devices is only set at the
+ * time of populating/creating OF device. If the device MSI
+ * domain is discovered later after the OF device is created
+ * then we need to set it explicitly before using any platform
+ * MSI functions.
+ *
+ * In case of APLIC device, the parent MSI domain is always
+ * IMSIC and the IMSIC MSI domains are created later through
+ * the platform driver probing so we set it explicitly here.
+ */
+ if (is_of_node(dev->fwnode))
+ of_msi_configure(dev, to_of_node(dev->fwnode));
+ }
+
+ if (!msi_create_device_irq_domain(dev, MSI_DEFAULT_DOMAIN, &aplic_msi_template,
+ priv->nr_irqs + 1, priv, priv)) {
+ dev_err(dev, "failed to create MSI irq domain\n");
+ return -ENOMEM;
+ }
+
+ /* Advertise the interrupt controller */
+ pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
+ dev_info(dev, "%d interrupts forwarded to MSI base %pa\n", priv->nr_irqs, &pa);
+
+ return 0;
+}
--
2.34.1


2024-02-20 06:10:43

by Anup Patel

Subject: [PATCH v13 12/13] RISC-V: Select APLIC and IMSIC drivers

The QEMU virt machine supports AIA emulation and we also have
quite a few RISC-V platforms with AIA support under development
so let us select APLIC and IMSIC drivers for all RISC-V platforms.

Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
---
arch/riscv/Kconfig | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index bffbd869a068..569f2b6fd60a 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -162,6 +162,8 @@ config RISCV
select PCI_DOMAINS_GENERIC if PCI
select PCI_MSI if PCI
select RISCV_ALTERNATIVE if !XIP_KERNEL
+ select RISCV_APLIC
+ select RISCV_IMSIC
select RISCV_INTC
select RISCV_TIMER if RISCV_SBI
select SIFIVE_PLIC
--
2.34.1


2024-02-20 06:10:56

by Anup Patel

Subject: [PATCH v13 13/13] MAINTAINERS: Add entry for RISC-V AIA drivers

Add myself as maintainer for RISC-V AIA drivers including the
RISC-V INTC driver which supports both AIA and non-AIA platforms.

Signed-off-by: Anup Patel <[email protected]>
---
MAINTAINERS | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9ed4d3868539..d948f9210f1b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18801,6 +18801,20 @@ S: Maintained
F: drivers/mtd/nand/raw/r852.c
F: drivers/mtd/nand/raw/r852.h

+RISC-V AIA DRIVERS
+M: Anup Patel <[email protected]>
+L: [email protected]
+S: Maintained
+F: Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
+F: Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
+F: drivers/irqchip/irq-riscv-aplic-*.c
+F: drivers/irqchip/irq-riscv-aplic-*.h
+F: drivers/irqchip/irq-riscv-imsic-*.c
+F: drivers/irqchip/irq-riscv-imsic-*.h
+F: drivers/irqchip/irq-riscv-intc.c
+F: include/linux/irqchip/riscv-aplic.h
+F: include/linux/irqchip/riscv-imsic.h
+
RISC-V ARCHITECTURE
M: Paul Walmsley <[email protected]>
M: Palmer Dabbelt <[email protected]>
--
2.34.1


2024-02-20 06:11:58

by Anup Patel

Subject: [PATCH v13 07/13] irqchip/riscv-imsic: Add device MSI domain support for platform devices

The Linux platform MSI support allows per-device MSI domains so let
us add a platform irqchip driver for RISC-V IMSIC which provides a
base IRQ domain with MSI parent support for platform device domains.

This driver assumes that the IMSIC state is already initialized by
the IMSIC early driver.

Signed-off-by: Anup Patel <[email protected]>
---
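The MSI message composition in the patch below can be pictured with a small
standalone model: the per-HART MSI page address is offset by the guest
index, bounds-checked against guest_index_bits, and then split into the two
32-bit address halves, with the local interrupt identity as the data. This
is only a sketch. struct msi_msg_sketch and both function names are
hypothetical, and 64-bit physical addresses are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IMSIC_MMIO_PAGE_SZ UINT64_C(0x1000)

/* Hypothetical mirror of the three struct msi_msg fields used here. */
struct msi_msg_sketch {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* Physical address of the MSI page for one guest index on one HART. */
static bool imsic_page_phys(uint64_t hart_msi_pa, uint32_t guest_index_bits,
			    uint32_t guest_index, uint64_t *out_pa)
{
	assert(guest_index_bits < 32);

	/* Reject guest indexes that do not fit in the guest index field. */
	if ((UINT32_C(1) << guest_index_bits) <= guest_index)
		return false;

	*out_pa = hart_msi_pa + guest_index * IMSIC_MMIO_PAGE_SZ;
	return true;
}

/* Compose a message for guest index 0, as the patch does. */
static bool imsic_compose(uint64_t hart_msi_pa, uint32_t guest_index_bits,
			  uint32_t local_id, struct msi_msg_sketch *msg)
{
	uint64_t pa;

	if (!imsic_page_phys(hart_msi_pa, guest_index_bits, 0, &pa))
		return false;

	msg->address_hi = (uint32_t)(pa >> 32);
	msg->address_lo = (uint32_t)pa;
	msg->data = local_id;
	return true;
}
```

Guest index 0 always passes the bounds check, since BIT(guest_index_bits) is
at least 1, so composition can only fail in the driver when the vector
itself is missing.
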
drivers/irqchip/Makefile | 2 +-
drivers/irqchip/irq-riscv-imsic-platform.c | 346 +++++++++++++++++++++
drivers/irqchip/irq-riscv-imsic-state.h | 1 +
3 files changed, 348 insertions(+), 1 deletion(-)
create mode 100644 drivers/irqchip/irq-riscv-imsic-platform.c

diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index d714724387ce..abca445a3229 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -95,7 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
-obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o
+obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o irq-riscv-imsic-platform.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c
new file mode 100644
index 000000000000..7ee44c493dbc
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic-platform.c
@@ -0,0 +1,346 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-imsic: " fmt
+#include <linux/bitmap.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/platform_device.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+
+#include "irq-riscv-imsic-state.h"
+
+static bool imsic_cpu_page_phys(unsigned int cpu, unsigned int guest_index,
+ phys_addr_t *out_msi_pa)
+{
+ struct imsic_global_config *global;
+ struct imsic_local_config *local;
+
+ global = &imsic->global;
+ local = per_cpu_ptr(global->local, cpu);
+
+ if (BIT(global->guest_index_bits) <= guest_index)
+ return false;
+
+ if (out_msi_pa)
+ *out_msi_pa = local->msi_pa +
+ (guest_index * IMSIC_MMIO_PAGE_SZ);
+
+ return true;
+}
+
+static void imsic_irq_mask(struct irq_data *d)
+{
+ imsic_vector_mask(irq_data_get_irq_chip_data(d));
+}
+
+static void imsic_irq_unmask(struct irq_data *d)
+{
+ imsic_vector_unmask(irq_data_get_irq_chip_data(d));
+}
+
+static int imsic_irq_retrigger(struct irq_data *d)
+{
+ struct imsic_vector *vec = irq_data_get_irq_chip_data(d);
+ struct imsic_local_config *local;
+
+ if (WARN_ON(vec == NULL))
+ return -ENOENT;
+
+ local = per_cpu_ptr(imsic->global.local, vec->cpu);
+ writel_relaxed(vec->local_id, local->msi_va);
+ return 0;
+}
+
+static void imsic_irq_compose_vector_msg(struct imsic_vector *vec, struct msi_msg *msg)
+{
+ phys_addr_t msi_addr;
+
+ if (WARN_ON(vec == NULL))
+ return;
+
+ if (WARN_ON(!imsic_cpu_page_phys(vec->cpu, 0, &msi_addr)))
+ return;
+
+ msg->address_hi = upper_32_bits(msi_addr);
+ msg->address_lo = lower_32_bits(msi_addr);
+ msg->data = vec->local_id;
+}
+
+static void imsic_irq_compose_msg(struct irq_data *d, struct msi_msg *msg)
+{
+ imsic_irq_compose_vector_msg(irq_data_get_irq_chip_data(d), msg);
+}
+
+#ifdef CONFIG_SMP
+static void imsic_msi_update_msg(struct irq_data *d, struct imsic_vector *vec)
+{
+ struct msi_msg msg[2] = { [1] = { }, };
+
+ imsic_irq_compose_vector_msg(vec, msg);
+ irq_data_get_irq_chip(d)->irq_write_msi_msg(d, msg);
+}
+
+static int imsic_irq_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
+ bool force)
+{
+ struct imsic_vector *old_vec, *new_vec;
+ struct irq_data *pd = d->parent_data;
+
+ old_vec = irq_data_get_irq_chip_data(pd);
+ if (WARN_ON(old_vec == NULL))
+ return -ENOENT;
+
+ /* If old vector cpu belongs to the target cpumask then do nothing */
+ if (cpumask_test_cpu(old_vec->cpu, mask_val))
+ return IRQ_SET_MASK_OK_DONE;
+
+ /* If move is already in-flight then return failure */
+ if (imsic_vector_get_move(old_vec))
+ return -EBUSY;
+
+ /* Get a new vector on the desired set of CPUs */
+ new_vec = imsic_vector_alloc(old_vec->hwirq, mask_val);
+ if (!new_vec)
+ return -ENOSPC;
+
+ /* Point device to the new vector */
+ imsic_msi_update_msg(d, new_vec);
+
+ /* Update irq descriptors with the new vector */
+ pd->chip_data = new_vec;
+
+ /* Update effective affinity of parent irq data */
+ irq_data_update_effective_affinity(pd, cpumask_of(new_vec->cpu));
+
+ /* Move state of the old vector to the new vector */
+ imsic_vector_move(old_vec, new_vec);
+
+ return IRQ_SET_MASK_OK_DONE;
+}
+#endif
+
+static struct irq_chip imsic_irq_base_chip = {
+ .name = "IMSIC",
+ .irq_mask = imsic_irq_mask,
+ .irq_unmask = imsic_irq_unmask,
+ .irq_retrigger = imsic_irq_retrigger,
+ .irq_compose_msi_msg = imsic_irq_compose_msg,
+ .flags = IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int imsic_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+ unsigned int nr_irqs, void *args)
+{
+ struct imsic_vector *vec;
+
+ /* Legacy-MSI or multi-MSI not supported yet. */
+ if (nr_irqs > 1)
+ return -EOPNOTSUPP;
+
+ vec = imsic_vector_alloc(virq, cpu_online_mask);
+ if (!vec)
+ return -ENOSPC;
+
+ irq_domain_set_info(domain, virq, virq,
+ &imsic_irq_base_chip, vec,
+ handle_simple_irq, NULL, NULL);
+ irq_set_noprobe(virq);
+ irq_set_affinity(virq, cpu_online_mask);
+
+ return 0;
+}
+
+static void imsic_irq_domain_free(struct irq_domain *domain, unsigned int virq,
+ unsigned int nr_irqs)
+{
+ struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+
+ imsic_vector_free(irq_data_get_irq_chip_data(d));
+ irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static int imsic_irq_domain_select(struct irq_domain *domain, struct irq_fwspec *fwspec,
+ enum irq_domain_bus_token bus_token)
+{
+ const struct msi_parent_ops *ops = domain->msi_parent_ops;
+ u32 busmask = BIT(bus_token);
+
+ if (fwspec->fwnode != domain->fwnode || fwspec->param_count != 0)
+ return 0;
+
+ /* Handle pure domain searches */
+ if (bus_token == ops->bus_select_token)
+ return 1;
+
+ return !!(ops->bus_select_mask & busmask);
+}
+
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+static void imsic_irq_debug_show(struct seq_file *m, struct irq_domain *d,
+ struct irq_data *irqd, int ind)
+{
+ if (!irqd) {
+ imsic_vector_debug_show_summary(m, ind);
+ return;
+ }
+
+ imsic_vector_debug_show(m, irq_data_get_irq_chip_data(irqd), ind);
+}
+#endif
+
+static const struct irq_domain_ops imsic_base_domain_ops = {
+ .alloc = imsic_irq_domain_alloc,
+ .free = imsic_irq_domain_free,
+ .select = imsic_irq_domain_select,
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+ .debug_show = imsic_irq_debug_show,
+#endif
+};
+
+static bool imsic_init_dev_msi_info(struct device *dev,
+ struct irq_domain *domain,
+ struct irq_domain *real_parent,
+ struct msi_domain_info *info)
+{
+ const struct msi_parent_ops *pops = real_parent->msi_parent_ops;
+
+ /* MSI parent domain specific settings */
+ switch (real_parent->bus_token) {
+ case DOMAIN_BUS_NEXUS:
+ if (WARN_ON_ONCE(domain != real_parent))
+ return false;
+#ifdef CONFIG_SMP
+ info->chip->irq_set_affinity = imsic_irq_set_affinity;
+#endif
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ return false;
+ }
+
+ /* Is the target supported? */
+ switch (info->bus_token) {
+ case DOMAIN_BUS_DEVICE_MSI:
+ /*
+ * Per-device MSI should never have any MSI feature bits
+ * set. Its sole purpose is to create a dumb interrupt
+ * chip which has a device specific irq_write_msi_msg()
+ * callback.
+ */
+ if (WARN_ON_ONCE(info->flags))
+ return false;
+
+ /* Core managed MSI descriptors */
+ info->flags |= MSI_FLAG_ALLOC_SIMPLE_MSI_DESCS |
+ MSI_FLAG_FREE_MSI_DESCS;
+ break;
+ case DOMAIN_BUS_WIRED_TO_MSI:
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ return false;
+ }
+
+ /* Use hierarchical chip operations for re-trigger */
+ info->chip->irq_retrigger = irq_chip_retrigger_hierarchy;
+
+ /*
+ * Mask out the domain specific MSI feature flags which are not
+ * supported by the real parent.
+ */
+ info->flags &= pops->supported_flags;
+
+ /* Enforce the required flags */
+ info->flags |= pops->required_flags;
+
+ return true;
+}
+
+#define MATCH_PLATFORM_MSI BIT(DOMAIN_BUS_PLATFORM_MSI)
+
+static const struct msi_parent_ops imsic_msi_parent_ops = {
+ .supported_flags = MSI_GENERIC_FLAGS_MASK,
+ .required_flags = MSI_FLAG_USE_DEF_DOM_OPS |
+ MSI_FLAG_USE_DEF_CHIP_OPS,
+ .bus_select_token = DOMAIN_BUS_NEXUS,
+ .bus_select_mask = MATCH_PLATFORM_MSI,
+ .init_dev_msi_info = imsic_init_dev_msi_info,
+};
+
+int imsic_irqdomain_init(void)
+{
+ struct imsic_global_config *global;
+
+ if (!imsic || !imsic->fwnode) {
+ pr_err("early driver not probed\n");
+ return -ENODEV;
+ }
+
+ if (imsic->base_domain) {
+ pr_err("%pfwP: irq domain already created\n", imsic->fwnode);
+ return -ENODEV;
+ }
+
+ /* Create Base IRQ domain */
+ imsic->base_domain = irq_domain_create_tree(imsic->fwnode,
+ &imsic_base_domain_ops, imsic);
+ if (!imsic->base_domain) {
+ pr_err("%pfwP: failed to create IMSIC base domain\n",
+ imsic->fwnode);
+ return -ENOMEM;
+ }
+ imsic->base_domain->flags |= IRQ_DOMAIN_FLAG_MSI_PARENT;
+ imsic->base_domain->msi_parent_ops = &imsic_msi_parent_ops;
+
+ irq_domain_update_bus_token(imsic->base_domain, DOMAIN_BUS_NEXUS);
+
+ global = &imsic->global;
+ pr_info("%pfwP: hart-index-bits: %d, guest-index-bits: %d\n",
+ imsic->fwnode, global->hart_index_bits, global->guest_index_bits);
+ pr_info("%pfwP: group-index-bits: %d, group-index-shift: %d\n",
+ imsic->fwnode, global->group_index_bits, global->group_index_shift);
+ pr_info("%pfwP: per-CPU IDs %d at base PPN %pa\n",
+ imsic->fwnode, global->nr_ids, &global->base_addr);
+ pr_info("%pfwP: total %d interrupts available\n",
+ imsic->fwnode, num_possible_cpus() * (global->nr_ids - 1));
+
+ return 0;
+}
+
+static int imsic_platform_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+
+ if (imsic && imsic->fwnode != dev->fwnode) {
+ dev_err(dev, "fwnode mismatch\n");
+ return -ENODEV;
+ }
+
+ return imsic_irqdomain_init();
+}
+
+static const struct of_device_id imsic_platform_match[] = {
+ { .compatible = "riscv,imsics" },
+ {}
+};
+
+static struct platform_driver imsic_platform_driver = {
+ .driver = {
+ .name = "riscv-imsic",
+ .of_match_table = imsic_platform_match,
+ },
+ .probe = imsic_platform_probe,
+};
+builtin_platform_driver(imsic_platform_driver);
diff --git a/drivers/irqchip/irq-riscv-imsic-state.h b/drivers/irqchip/irq-riscv-imsic-state.h
index f0c983db99eb..a01a70eff3af 100644
--- a/drivers/irqchip/irq-riscv-imsic-state.h
+++ b/drivers/irqchip/irq-riscv-imsic-state.h
@@ -94,5 +94,6 @@ void imsic_vector_debug_show_summary(struct seq_file *m, int ind);
void imsic_state_online(void);
void imsic_state_offline(void);
int imsic_setup_state(struct fwnode_handle *fwnode);
+int imsic_irqdomain_init(void);

#endif
--
2.34.1
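
For readers following the domain matching in imsic_irq_domain_select()
above, the decision can be sketched in plain C as below. This is only an
illustration under stated assumptions: the enum values are made up (the
real ones come from enum irq_domain_bus_token), and sketch_parent_ops
and sketch_domain_select() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical token values, chosen only for this sketch. */
enum sketch_bus_token {
	SKETCH_BUS_ANY = 0,
	SKETCH_BUS_PLATFORM_MSI = 3,
	SKETCH_BUS_NEXUS = 4,
	SKETCH_BUS_DEVICE_MSI = 5,
};

/* Reduced stand-in for the two msi_parent_ops fields used by select. */
struct sketch_parent_ops {
	enum sketch_bus_token bus_select_token;
	uint32_t bus_select_mask;
};

static int sketch_domain_select(const struct sketch_parent_ops *ops,
				bool fwnode_matches, unsigned int param_count,
				enum sketch_bus_token bus_token)
{
	uint32_t busmask = UINT32_C(1) << bus_token;

	/* Only parameterless lookups on the domain's own fwnode can match. */
	if (!fwnode_matches || param_count != 0)
		return 0;

	/* A search for the parent's own token always matches. */
	if (bus_token == ops->bus_select_token)
		return 1;

	/* Otherwise the token must be one the parent advertises in its mask. */
	return !!(ops->bus_select_mask & busmask);
}
```

With a mask containing only the platform MSI token, a lookup for a
device MSI token does not select this domain, which is the behaviour the
patch relies on.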


2024-02-20 06:12:28

by Anup Patel

Subject: [PATCH v13 10/13] irqchip: Add RISC-V advanced PLIC driver for direct-mode

The RISC-V advanced interrupt architecture (AIA) specification defines
an advanced platform-level interrupt controller (APLIC) which has two
modes of operation: 1) Direct mode and 2) MSI mode.
(For more details, refer to https://github.com/riscv/riscv-aia)

In APLIC direct-mode, wired interrupts are forwarded to CPUs (or HARTs)
as a local external interrupt.

Add a platform irqchip driver for the RISC-V APLIC direct-mode to
support RISC-V platforms having only wired interrupts.

Signed-off-by: Anup Patel <[email protected]>
---
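Note: in direct mode each wired source is routed by its TARGET register,
which this patch fills with a hart index and a priority. The userspace
sketch below shows the register offset and value arithmetic using the
field positions from riscv-aplic.h in this patch. The SKETCH_* macros
and helper names are hypothetical, and unlike FIELD_PREP() the sketch
silently masks out-of-range inputs.

```c
#include <stdint.h>

/* Values copied from include/linux/irqchip/riscv-aplic.h in this patch. */
#define SKETCH_TARGET_BASE		0x3004u
#define SKETCH_TARGET_HART_IDX_SHIFT	18
#define SKETCH_TARGET_HART_IDX_MASK	0x3fffu
#define SKETCH_TARGET_IPRIO_MASK	0xffu
#define SKETCH_DEFAULT_PRIORITY		1u

/* Source numbers start at 1, so source N uses the (N - 1)th 32-bit slot. */
static uint32_t sketch_target_offset(uint32_t hwirq)
{
	return SKETCH_TARGET_BASE + (hwirq - 1) * (uint32_t)sizeof(uint32_t);
}

/* Pack the hart index into bits 31:18 and the priority into bits 7:0. */
static uint32_t sketch_target_value(uint32_t hart_index, uint32_t prio)
{
	return ((hart_index & SKETCH_TARGET_HART_IDX_MASK)
			<< SKETCH_TARGET_HART_IDX_SHIFT) |
	       (prio & SKETCH_TARGET_IPRIO_MASK);
}
```

Setting affinity in direct mode then amounts to one 32-bit write of
sketch_target_value() at sketch_target_offset() for the source.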
drivers/irqchip/Kconfig | 5 +
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-riscv-aplic-direct.c | 324 +++++++++++++++++++++++
drivers/irqchip/irq-riscv-aplic-main.c | 211 +++++++++++++++
drivers/irqchip/irq-riscv-aplic-main.h | 44 +++
include/linux/irqchip/riscv-aplic.h | 145 ++++++++++
6 files changed, 730 insertions(+)
create mode 100644 drivers/irqchip/irq-riscv-aplic-direct.c
create mode 100644 drivers/irqchip/irq-riscv-aplic-main.c
create mode 100644 drivers/irqchip/irq-riscv-aplic-main.h
create mode 100644 include/linux/irqchip/riscv-aplic.h

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 2fc0cb32341a..dbc8811d3764 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -546,6 +546,11 @@ config SIFIVE_PLIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP

+config RISCV_APLIC
+ bool
+ depends on RISCV
+ select IRQ_DOMAIN_HIERARCHY
+
config RISCV_IMSIC
bool
depends on RISCV
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index abca445a3229..7f8289790ed8 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
+obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic-main.o irq-riscv-aplic-direct.o
obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o irq-riscv-imsic-platform.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
diff --git a/drivers/irqchip/irq-riscv-aplic-direct.c b/drivers/irqchip/irq-riscv-aplic-direct.c
new file mode 100644
index 000000000000..2430d6b81026
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic-direct.c
@@ -0,0 +1,324 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-aplic.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/printk.h>
+#include <linux/smp.h>
+
+#include "irq-riscv-aplic-main.h"
+
+#define APLIC_DISABLE_IDELIVERY 0
+#define APLIC_ENABLE_IDELIVERY 1
+#define APLIC_DISABLE_ITHRESHOLD 1
+#define APLIC_ENABLE_ITHRESHOLD 0
+
+struct aplic_direct {
+ struct aplic_priv priv;
+ struct irq_domain *irqdomain;
+ struct cpumask lmask;
+};
+
+struct aplic_idc {
+ unsigned int hart_index;
+ void __iomem *regs;
+ struct aplic_direct *direct;
+};
+
+static unsigned int aplic_direct_parent_irq;
+static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
+
+static void aplic_direct_irq_eoi(struct irq_data *d)
+{
+ /*
+ * The fasteoi_handler requires irq_eoi() callback hence
+ * provide a dummy handler.
+ */
+}
+
+#ifdef CONFIG_SMP
+static int aplic_direct_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
+ bool force)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ struct aplic_direct *direct = container_of(priv, struct aplic_direct, priv);
+ struct aplic_idc *idc;
+ unsigned int cpu, val;
+ struct cpumask amask;
+ void __iomem *target;
+
+ cpumask_and(&amask, &direct->lmask, mask_val);
+
+ if (force)
+ cpu = cpumask_first(&amask);
+ else
+ cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+ if (cpu >= nr_cpu_ids)
+ return -EINVAL;
+
+ idc = per_cpu_ptr(&aplic_idcs, cpu);
+ target = priv->regs + APLIC_TARGET_BASE + (d->hwirq - 1) * sizeof(u32);
+ val = FIELD_PREP(APLIC_TARGET_HART_IDX, idc->hart_index);
+ val |= FIELD_PREP(APLIC_TARGET_IPRIO, APLIC_DEFAULT_PRIORITY);
+ writel(val, target);
+
+ irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+ return IRQ_SET_MASK_OK_DONE;
+}
+#endif
+
+static struct irq_chip aplic_direct_chip = {
+ .name = "APLIC-DIRECT",
+ .irq_mask = aplic_irq_mask,
+ .irq_unmask = aplic_irq_unmask,
+ .irq_set_type = aplic_irq_set_type,
+ .irq_eoi = aplic_direct_irq_eoi,
+#ifdef CONFIG_SMP
+ .irq_set_affinity = aplic_direct_set_affinity,
+#endif
+ .flags = IRQCHIP_SET_TYPE_MASKED |
+ IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int aplic_direct_irqdomain_translate(struct irq_domain *d, struct irq_fwspec *fwspec,
+ unsigned long *hwirq, unsigned int *type)
+{
+ struct aplic_priv *priv = d->host_data;
+
+ return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
+}
+
+static int aplic_direct_irqdomain_alloc(struct irq_domain *domain, unsigned int virq,
+ unsigned int nr_irqs, void *arg)
+{
+ struct aplic_priv *priv = domain->host_data;
+ struct aplic_direct *direct = container_of(priv, struct aplic_direct, priv);
+ struct irq_fwspec *fwspec = arg;
+ irq_hw_number_t hwirq;
+ unsigned int type;
+ int i, ret;
+
+ ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++) {
+ irq_domain_set_info(domain, virq + i, hwirq + i, &aplic_direct_chip,
+ priv, handle_fasteoi_irq, NULL, NULL);
+ irq_set_affinity(virq + i, &direct->lmask);
+ }
+
+ return 0;
+}
+
+static const struct irq_domain_ops aplic_direct_irqdomain_ops = {
+ .translate = aplic_direct_irqdomain_translate,
+ .alloc = aplic_direct_irqdomain_alloc,
+ .free = irq_domain_free_irqs_top,
+};
+
+/*
+ * To handle APLIC direct interrupts, we just read the CLAIMI register
+ * which returns the highest priority pending interrupt and clears the
+ * pending bit of that interrupt. This process is repeated until the
+ * CLAIMI register returns zero.
+ */
+static void aplic_direct_handle_irq(struct irq_desc *desc)
+{
+ struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
+ struct irq_domain *irqdomain = idc->direct->irqdomain;
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ irq_hw_number_t hw_irq;
+ int irq;
+
+ chained_irq_enter(chip, desc);
+
+ while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
+ hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
+ irq = irq_find_mapping(irqdomain, hw_irq);
+
+ if (unlikely(irq <= 0))
+ dev_warn_ratelimited(idc->direct->priv.dev,
+ "hw_irq %lu mapping not found\n", hw_irq);
+ else
+ generic_handle_irq(irq);
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
+{
+ u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
+ u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
+
+ /* Priority must be less than threshold for interrupt triggering */
+ writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
+
+ /* Delivery must be set to 1 for interrupt triggering */
+ writel(de, idc->regs + APLIC_IDC_IDELIVERY);
+}
+
+static int aplic_direct_dying_cpu(unsigned int cpu)
+{
+ if (aplic_direct_parent_irq)
+ disable_percpu_irq(aplic_direct_parent_irq);
+
+ return 0;
+}
+
+static int aplic_direct_starting_cpu(unsigned int cpu)
+{
+ if (aplic_direct_parent_irq)
+ enable_percpu_irq(aplic_direct_parent_irq,
+ irq_get_trigger_type(aplic_direct_parent_irq));
+
+ return 0;
+}
+
+static int aplic_direct_parse_parent_hwirq(struct device *dev, u32 index,
+ u32 *parent_hwirq, unsigned long *parent_hartid)
+{
+ struct of_phandle_args parent;
+ int rc;
+
+ /*
+ * Currently, only OF fwnode is supported; this function must be
+ * extended for ACPI support.
+ */
+ if (!is_of_node(dev->fwnode))
+ return -EINVAL;
+
+ rc = of_irq_parse_one(to_of_node(dev->fwnode), index, &parent);
+ if (rc)
+ return rc;
+
+ rc = riscv_of_parent_hartid(parent.np, parent_hartid);
+ if (rc)
+ return rc;
+
+ *parent_hwirq = parent.args[0];
+ return 0;
+}
+
+int aplic_direct_setup(struct device *dev, void __iomem *regs)
+{
+ int i, j, rc, cpu, current_cpu, setup_count = 0;
+ struct aplic_direct *direct;
+ struct irq_domain *domain;
+ struct aplic_priv *priv;
+ struct aplic_idc *idc;
+ unsigned long hartid;
+ u32 v, hwirq;
+
+ direct = devm_kzalloc(dev, sizeof(*direct), GFP_KERNEL);
+ if (!direct)
+ return -ENOMEM;
+ priv = &direct->priv;
+
+ rc = aplic_setup_priv(priv, dev, regs);
+ if (rc) {
+ dev_err(dev, "failed to create APLIC context\n");
+ return rc;
+ }
+
+ /* Setup per-CPU IDC and target CPU mask */
+ current_cpu = get_cpu();
+ for (i = 0; i < priv->nr_idcs; i++) {
+ rc = aplic_direct_parse_parent_hwirq(dev, i, &hwirq, &hartid);
+ if (rc) {
+ dev_warn(dev, "parent irq for IDC%d not found\n", i);
+ continue;
+ }
+
+ /*
+ * Skip interrupts other than external interrupts for
+ * current privilege level.
+ */
+ if (hwirq != RV_IRQ_EXT)
+ continue;
+
+ cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0) {
+ dev_warn(dev, "invalid cpuid for IDC%d\n", i);
+ continue;
+ }
+
+ cpumask_set_cpu(cpu, &direct->lmask);
+
+ idc = per_cpu_ptr(&aplic_idcs, cpu);
+ idc->hart_index = i;
+ idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
+ idc->direct = direct;
+
+ aplic_idc_set_delivery(idc, true);
+
+ /*
+ * Boot cpu might not have APLIC hart_index = 0 so check
+ * and update target registers of all interrupts.
+ */
+ if (cpu == current_cpu && idc->hart_index) {
+ v = FIELD_PREP(APLIC_TARGET_HART_IDX, idc->hart_index);
+ v |= FIELD_PREP(APLIC_TARGET_IPRIO, APLIC_DEFAULT_PRIORITY);
+ for (j = 1; j <= priv->nr_irqs; j++)
+ writel(v, priv->regs + APLIC_TARGET_BASE + (j - 1) * sizeof(u32));
+ }
+
+ setup_count++;
+ }
+ put_cpu();
+
+ /* Find parent domain and register chained handler */
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+ DOMAIN_BUS_ANY);
+ if (!aplic_direct_parent_irq && domain) {
+ aplic_direct_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (aplic_direct_parent_irq) {
+ irq_set_chained_handler(aplic_direct_parent_irq,
+ aplic_direct_handle_irq);
+
+ /*
+ * Setup CPUHP notifier to enable parent
+ * interrupt on all CPUs
+ */
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "irqchip/riscv/aplic:starting",
+ aplic_direct_starting_cpu,
+ aplic_direct_dying_cpu);
+ }
+ }
+
+ /* Fail if we were not able to setup IDC for any CPU */
+ if (!setup_count)
+ return -ENODEV;
+
+ /* Setup global config and interrupt delivery */
+ aplic_init_hw_global(priv, false);
+
+ /* Create irq domain instance for the APLIC */
+ direct->irqdomain = irq_domain_create_linear(dev->fwnode, priv->nr_irqs + 1,
+ &aplic_direct_irqdomain_ops, priv);
+ if (!direct->irqdomain) {
+ dev_err(dev, "failed to create direct irq domain\n");
+ return -ENOMEM;
+ }
+
+ /* Advertise the interrupt controller */
+ dev_info(dev, "%d interrupts directly connected to %d CPUs\n",
+ priv->nr_irqs, priv->nr_idcs);
+
+ return 0;
+}
diff --git a/drivers/irqchip/irq-riscv-aplic-main.c b/drivers/irqchip/irq-riscv-aplic-main.c
new file mode 100644
index 000000000000..e6617147daff
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic-main.c
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/irqchip/riscv-aplic.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+#include <linux/printk.h>
+
+#include "irq-riscv-aplic-main.h"
+
+void aplic_irq_unmask(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ writel(d->hwirq, priv->regs + APLIC_SETIENUM);
+}
+
+void aplic_irq_mask(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
+}
+
+int aplic_irq_set_type(struct irq_data *d, unsigned int type)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ void __iomem *sourcecfg;
+ u32 val = 0;
+
+ switch (type) {
+ case IRQ_TYPE_NONE:
+ val = APLIC_SOURCECFG_SM_INACTIVE;
+ break;
+ case IRQ_TYPE_LEVEL_LOW:
+ val = APLIC_SOURCECFG_SM_LEVEL_LOW;
+ break;
+ case IRQ_TYPE_LEVEL_HIGH:
+ val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
+ break;
+ case IRQ_TYPE_EDGE_FALLING:
+ val = APLIC_SOURCECFG_SM_EDGE_FALL;
+ break;
+ case IRQ_TYPE_EDGE_RISING:
+ val = APLIC_SOURCECFG_SM_EDGE_RISE;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
+ sourcecfg += (d->hwirq - 1) * sizeof(u32);
+ writel(val, sourcecfg);
+
+ return 0;
+}
+
+int aplic_irqdomain_translate(struct irq_fwspec *fwspec, u32 gsi_base,
+ unsigned long *hwirq, unsigned int *type)
+{
+ if (WARN_ON(fwspec->param_count < 2))
+ return -EINVAL;
+ if (WARN_ON(!fwspec->param[0]))
+ return -EINVAL;
+
+ /* For DT, gsi_base is always zero. */
+ *hwirq = fwspec->param[0] - gsi_base;
+ *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
+
+ WARN_ON(*type == IRQ_TYPE_NONE);
+
+ return 0;
+}
+
+void aplic_init_hw_global(struct aplic_priv *priv, bool msi_mode)
+{
+ u32 val;
+#ifdef CONFIG_RISCV_M_MODE
+ u32 valH;
+
+ if (msi_mode) {
+ val = lower_32_bits(priv->msicfg.base_ppn);
+ valH = FIELD_PREP(APLIC_xMSICFGADDRH_BAPPN, upper_32_bits(priv->msicfg.base_ppn));
+ valH |= FIELD_PREP(APLIC_xMSICFGADDRH_LHXW, priv->msicfg.lhxw);
+ valH |= FIELD_PREP(APLIC_xMSICFGADDRH_HHXW, priv->msicfg.hhxw);
+ valH |= FIELD_PREP(APLIC_xMSICFGADDRH_LHXS, priv->msicfg.lhxs);
+ valH |= FIELD_PREP(APLIC_xMSICFGADDRH_HHXS, priv->msicfg.hhxs);
+ writel(val, priv->regs + APLIC_xMSICFGADDR);
+ writel(valH, priv->regs + APLIC_xMSICFGADDRH);
+ }
+#endif
+
+ /* Setup APLIC domaincfg register */
+ val = readl(priv->regs + APLIC_DOMAINCFG);
+ val |= APLIC_DOMAINCFG_IE;
+ if (msi_mode)
+ val |= APLIC_DOMAINCFG_DM;
+ writel(val, priv->regs + APLIC_DOMAINCFG);
+ if (readl(priv->regs + APLIC_DOMAINCFG) != val)
+ dev_warn(priv->dev, "unable to write 0x%x in domaincfg\n", val);
+}
+
+static void aplic_init_hw_irqs(struct aplic_priv *priv)
+{
+ int i;
+
+ /* Disable all interrupts */
+ for (i = 0; i <= priv->nr_irqs; i += 32)
+ writel(-1U, priv->regs + APLIC_CLRIE_BASE + (i / 32) * sizeof(u32));
+
+ /* Set interrupt type and default priority for all interrupts */
+ for (i = 1; i <= priv->nr_irqs; i++) {
+ writel(0, priv->regs + APLIC_SOURCECFG_BASE + (i - 1) * sizeof(u32));
+ writel(APLIC_DEFAULT_PRIORITY,
+ priv->regs + APLIC_TARGET_BASE + (i - 1) * sizeof(u32));
+ }
+
+ /* Clear APLIC domaincfg */
+ writel(0, priv->regs + APLIC_DOMAINCFG);
+}
+
+int aplic_setup_priv(struct aplic_priv *priv, struct device *dev, void __iomem *regs)
+{
+ struct of_phandle_args parent;
+ int rc;
+
+ /*
+ * Currently, only OF fwnode is supported; this function must be
+ * extended for ACPI support.
+ */
+ if (!is_of_node(dev->fwnode))
+ return -EINVAL;
+
+ /* Save device pointer and register base */
+ priv->dev = dev;
+ priv->regs = regs;
+
+ /* Find out number of interrupt sources */
+ rc = of_property_read_u32(to_of_node(dev->fwnode), "riscv,num-sources",
+ &priv->nr_irqs);
+ if (rc) {
+ dev_err(dev, "failed to get number of interrupt sources\n");
+ return rc;
+ }
+
+ /*
+ * Find out number of IDCs based on parent interrupts
+ *
+ * If "msi-parent" property is present then we ignore the
+ * APLIC IDCs which forces the APLIC driver to use MSI mode.
+ */
+ if (!of_property_present(to_of_node(dev->fwnode), "msi-parent")) {
+ while (!of_irq_parse_one(to_of_node(dev->fwnode), priv->nr_idcs, &parent))
+ priv->nr_idcs++;
+ }
+
+ /* Setup initial state of APLIC interrupts */
+ aplic_init_hw_irqs(priv);
+
+ return 0;
+}
+
+static int aplic_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ bool msi_mode = false;
+ void __iomem *regs;
+ int rc;
+
+ /* Map the MMIO registers */
+ regs = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(regs)) {
+ dev_err(dev, "failed to map MMIO registers\n");
+ return PTR_ERR(regs);
+ }
+
+ /*
+ * If msi-parent property is present then setup APLIC MSI
+ * mode otherwise setup APLIC direct mode.
+ */
+ if (is_of_node(dev->fwnode))
+ msi_mode = of_property_present(to_of_node(dev->fwnode), "msi-parent");
+ if (msi_mode)
+ rc = -ENODEV;
+ else
+ rc = aplic_direct_setup(dev, regs);
+ if (rc)
+ dev_err(dev, "failed to setup APLIC in %s mode\n", msi_mode ? "MSI" : "direct");
+
+ return rc;
+}
+
+static const struct of_device_id aplic_match[] = {
+ { .compatible = "riscv,aplic" },
+ {}
+};
+
+static struct platform_driver aplic_driver = {
+ .driver = {
+ .name = "riscv-aplic",
+ .of_match_table = aplic_match,
+ },
+ .probe = aplic_probe,
+};
+builtin_platform_driver(aplic_driver);
diff --git a/drivers/irqchip/irq-riscv-aplic-main.h b/drivers/irqchip/irq-riscv-aplic-main.h
new file mode 100644
index 000000000000..4cfbadf37ddc
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic-main.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#ifndef _IRQ_RISCV_APLIC_MAIN_H
+#define _IRQ_RISCV_APLIC_MAIN_H
+
+#include <linux/device.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqdomain.h>
+#include <linux/fwnode.h>
+
+#define APLIC_DEFAULT_PRIORITY 1
+
+struct aplic_msicfg {
+ phys_addr_t base_ppn;
+ u32 hhxs;
+ u32 hhxw;
+ u32 lhxs;
+ u32 lhxw;
+};
+
+struct aplic_priv {
+ struct device *dev;
+ u32 gsi_base;
+ u32 nr_irqs;
+ u32 nr_idcs;
+ void __iomem *regs;
+ struct aplic_msicfg msicfg;
+};
+
+void aplic_irq_unmask(struct irq_data *d);
+void aplic_irq_mask(struct irq_data *d);
+int aplic_irq_set_type(struct irq_data *d, unsigned int type);
+int aplic_irqdomain_translate(struct irq_fwspec *fwspec, u32 gsi_base,
+ unsigned long *hwirq, unsigned int *type);
+void aplic_init_hw_global(struct aplic_priv *priv, bool msi_mode);
+int aplic_setup_priv(struct aplic_priv *priv, struct device *dev, void __iomem *regs);
+int aplic_direct_setup(struct device *dev, void __iomem *regs);
+
+#endif
diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
new file mode 100644
index 000000000000..ec8f7df50583
--- /dev/null
+++ b/include/linux/irqchip/riscv-aplic.h
@@ -0,0 +1,145 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
+#define __LINUX_IRQCHIP_RISCV_APLIC_H
+
+#include <linux/bitops.h>
+
+#define APLIC_MAX_IDC BIT(14)
+#define APLIC_MAX_SOURCE 1024
+
+#define APLIC_DOMAINCFG 0x0000
+#define APLIC_DOMAINCFG_RDONLY 0x80000000
+#define APLIC_DOMAINCFG_IE BIT(8)
+#define APLIC_DOMAINCFG_DM BIT(2)
+#define APLIC_DOMAINCFG_BE BIT(0)
+
+#define APLIC_SOURCECFG_BASE 0x0004
+#define APLIC_SOURCECFG_D BIT(10)
+#define APLIC_SOURCECFG_CHILDIDX_MASK 0x000003ff
+#define APLIC_SOURCECFG_SM_MASK 0x00000007
+#define APLIC_SOURCECFG_SM_INACTIVE 0x0
+#define APLIC_SOURCECFG_SM_DETACH 0x1
+#define APLIC_SOURCECFG_SM_EDGE_RISE 0x4
+#define APLIC_SOURCECFG_SM_EDGE_FALL 0x5
+#define APLIC_SOURCECFG_SM_LEVEL_HIGH 0x6
+#define APLIC_SOURCECFG_SM_LEVEL_LOW 0x7
+
+#define APLIC_MMSICFGADDR 0x1bc0
+#define APLIC_MMSICFGADDRH 0x1bc4
+#define APLIC_SMSICFGADDR 0x1bc8
+#define APLIC_SMSICFGADDRH 0x1bcc
+
+#ifdef CONFIG_RISCV_M_MODE
+#define APLIC_xMSICFGADDR APLIC_MMSICFGADDR
+#define APLIC_xMSICFGADDRH APLIC_MMSICFGADDRH
+#else
+#define APLIC_xMSICFGADDR APLIC_SMSICFGADDR
+#define APLIC_xMSICFGADDRH APLIC_SMSICFGADDRH
+#endif
+
+#define APLIC_xMSICFGADDRH_L BIT(31)
+#define APLIC_xMSICFGADDRH_HHXS_MASK 0x1f
+#define APLIC_xMSICFGADDRH_HHXS_SHIFT 24
+#define APLIC_xMSICFGADDRH_HHXS (APLIC_xMSICFGADDRH_HHXS_MASK << \
+ APLIC_xMSICFGADDRH_HHXS_SHIFT)
+#define APLIC_xMSICFGADDRH_LHXS_MASK 0x7
+#define APLIC_xMSICFGADDRH_LHXS_SHIFT 20
+#define APLIC_xMSICFGADDRH_LHXS (APLIC_xMSICFGADDRH_LHXS_MASK << \
+ APLIC_xMSICFGADDRH_LHXS_SHIFT)
+#define APLIC_xMSICFGADDRH_HHXW_MASK 0x7
+#define APLIC_xMSICFGADDRH_HHXW_SHIFT 16
+#define APLIC_xMSICFGADDRH_HHXW (APLIC_xMSICFGADDRH_HHXW_MASK << \
+ APLIC_xMSICFGADDRH_HHXW_SHIFT)
+#define APLIC_xMSICFGADDRH_LHXW_MASK 0xf
+#define APLIC_xMSICFGADDRH_LHXW_SHIFT 12
+#define APLIC_xMSICFGADDRH_LHXW (APLIC_xMSICFGADDRH_LHXW_MASK << \
+ APLIC_xMSICFGADDRH_LHXW_SHIFT)
+#define APLIC_xMSICFGADDRH_BAPPN_MASK 0xfff
+#define APLIC_xMSICFGADDRH_BAPPN_SHIFT 0
+#define APLIC_xMSICFGADDRH_BAPPN (APLIC_xMSICFGADDRH_BAPPN_MASK << \
+ APLIC_xMSICFGADDRH_BAPPN_SHIFT)
+
+#define APLIC_xMSICFGADDR_PPN_SHIFT 12
+
+#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
+ (BIT(__lhxs) - 1)
+
+#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
+ (BIT(__lhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
+ ((__lhxs))
+#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
+ (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
+ APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
+
+#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
+ (BIT(__hhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
+ ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
+#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
+ (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
+ APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
+
+#define APLIC_IRQBITS_PER_REG 32
+
+#define APLIC_SETIP_BASE 0x1c00
+#define APLIC_SETIPNUM 0x1cdc
+
+#define APLIC_CLRIP_BASE 0x1d00
+#define APLIC_CLRIPNUM 0x1ddc
+
+#define APLIC_SETIE_BASE 0x1e00
+#define APLIC_SETIENUM 0x1edc
+
+#define APLIC_CLRIE_BASE 0x1f00
+#define APLIC_CLRIENUM 0x1fdc
+
+#define APLIC_SETIPNUM_LE 0x2000
+#define APLIC_SETIPNUM_BE 0x2004
+
+#define APLIC_GENMSI 0x3000
+
+#define APLIC_TARGET_BASE 0x3004
+#define APLIC_TARGET_HART_IDX_SHIFT 18
+#define APLIC_TARGET_HART_IDX_MASK 0x3fff
+#define APLIC_TARGET_HART_IDX (APLIC_TARGET_HART_IDX_MASK << \
+ APLIC_TARGET_HART_IDX_SHIFT)
+#define APLIC_TARGET_GUEST_IDX_SHIFT 12
+#define APLIC_TARGET_GUEST_IDX_MASK 0x3f
+#define APLIC_TARGET_GUEST_IDX (APLIC_TARGET_GUEST_IDX_MASK << \
+ APLIC_TARGET_GUEST_IDX_SHIFT)
+#define APLIC_TARGET_IPRIO_SHIFT 0
+#define APLIC_TARGET_IPRIO_MASK 0xff
+#define APLIC_TARGET_IPRIO (APLIC_TARGET_IPRIO_MASK << \
+ APLIC_TARGET_IPRIO_SHIFT)
+#define APLIC_TARGET_EIID_SHIFT 0
+#define APLIC_TARGET_EIID_MASK 0x7ff
+#define APLIC_TARGET_EIID (APLIC_TARGET_EIID_MASK << \
+ APLIC_TARGET_EIID_SHIFT)
+
+#define APLIC_IDC_BASE 0x4000
+#define APLIC_IDC_SIZE 32
+
+#define APLIC_IDC_IDELIVERY 0x00
+
+#define APLIC_IDC_IFORCE 0x04
+
+#define APLIC_IDC_ITHRESHOLD 0x08
+
+#define APLIC_IDC_TOPI 0x18
+#define APLIC_IDC_TOPI_ID_SHIFT 16
+#define APLIC_IDC_TOPI_ID_MASK 0x3ff
+#define APLIC_IDC_TOPI_ID (APLIC_IDC_TOPI_ID_MASK << \
+ APLIC_IDC_TOPI_ID_SHIFT)
+#define APLIC_IDC_TOPI_PRIO_SHIFT 0
+#define APLIC_IDC_TOPI_PRIO_MASK 0xff
+#define APLIC_IDC_TOPI_PRIO (APLIC_IDC_TOPI_PRIO_MASK << \
+ APLIC_IDC_TOPI_PRIO_SHIFT)
+
+#define APLIC_IDC_CLAIMI 0x1c
+
+#endif
--
2.34.1


2024-02-20 10:10:07

by Thomas Gleixner

Subject: Re: [PATCH v13 01/13] irqchip/sifive-plic: Convert PLIC driver into a platform driver

On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> The PLIC driver does not require very early initialization so let
> us convert it into a platform driver.

s/let us convert/convert/

Please use passive voice and imperative mood all over the changelogs. No
we/us, let....

> As part of the conversion, the PLIC probing undergoes the following
> changes:
> 1. Use dev_info(), dev_err() and dev_warn() instead of pr_info(),
> pr_err() and pr_warn()
> 2. Use devm_xyz() APIs wherever applicable
> 3. PLIC is now probed after CPUs are brought-up so we have to
> setup cpuhp state after context handler of all online CPUs
> are initialized otherwise we see crash on multi-socket systems

This patch is really doing too many things at once, which makes it hard
to review. Can you split this into digestible pieces please?

> if (unlikely(err))
> - pr_warn_ratelimited("can't find mapping for hwirq %lu\n",
> + dev_warn_ratelimited(handler->priv->dev,
> + "can't find mapping for hwirq %lu\n",
> hwirq);

Nit. Please use brackets around the condition. See:

https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#bracket-rules

for reasoning.
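
As a rough illustration of that rule (not code from the patch): once the
statement under the condition is broken across more than one line, it gets
brackets. In the sketch below, warn_no_mapping() and report_mapping_error()
are hypothetical userspace stand-ins for dev_warn_ratelimited() and its
caller, so that the snippet compiles on its own.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for dev_warn_ratelimited(): formats the warning
 * into buf instead of logging it and returns the number of characters.
 */
static int warn_no_mapping(char *buf, size_t len, const char *dev, unsigned long hwirq)
{
	return snprintf(buf, len, "%s: can't find mapping for hwirq %lu\n", dev, hwirq);
}

/* The call below spans two lines, so the if body is wrapped in brackets. */
static int report_mapping_error(int err, char *buf, size_t len, unsigned long hwirq)
{
	int written = 0;

	if (err) {
		written = warn_no_mapping(buf, len, "plic",
					  hwirq);
	}

	return written;
}
```

With the 100 character limit the call would normally fit on one line, in
which case the brackets can be dropped again.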

Thanks,

tglx

2024-02-20 10:38:48

by Thomas Gleixner

Subject: Re: [PATCH v13 02/13] irqchip/sifive-plic: Improve locking safety by using irqsave/irqrestore

On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> Now that PLIC driver is probed as a regular platform driver, the lock
> dependency validator complains about the safety of handler->enable_lock
> usage:
>
> [ 0.956775] Possible interrupt unsafe locking scenario:
>
> [ 0.956998] CPU0 CPU1
> [ 0.957247] ---- ----
> [ 0.957439] lock(&handler->enable_lock);
> [ 0.957607] local_irq_disable();
> [ 0.957793] lock(&irq_desc_lock_class);
> [ 0.958021] lock(&handler->enable_lock);
> [ 0.958246] <Interrupt>
> [ 0.958342] lock(&irq_desc_lock_class);
> [ 0.958501]
> *** DEADLOCK ***
>
> To address above, let's use raw_spin_lock_irqsave/unlock_irqrestore()
> instead of raw_spin_lock/unlock().

s/let's//

2024-02-20 10:42:12

by Thomas Gleixner

Subject: Re: [PATCH v13 03/13] irqchip/riscv-intc: Add support for RISC-V AIA

On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:

> The RISC-V advanced interrupt architecture (AIA) extends the per-HART
> local interrupts in following ways:
> 1. Minimum 64 local interrupts for both RV32 and RV64
> 2. Ability to process multiple pending local interrupts in same
> interrupt handler
> 3. Priority configuration for each local interrupts
> 4. Special CSRs to configure/access the per-HART MSI controller
>
> We add support for #1 and #2 described above in the RISC-V intc
> driver.

s/We add/Add/

> +static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)
> +{
> + unsigned long topi;
> +
> + while ((topi = csr_read(CSR_TOPI)))
> + generic_handle_domain_irq(intc_domain,
> + topi >> TOPI_IID_SHIFT);

Please let it stick out. You got 100 characters. All over the place.

Thanks,

tglx

2024-02-20 11:53:08

by Björn Töpel

Subject: Re: [PATCH v13 06/13] irqchip: Add RISC-V incoming MSI controller early driver

Anup,

This version is so much easier to follow! Thanks a lot for the
cleanups/design changes.

A bunch of nits, and a major one, below.

Anup Patel <[email protected]> writes:

> The RISC-V advanced interrupt architecture (AIA) specification
> defines a new MSI controller called incoming message signalled
> interrupt controller (IMSIC) which manages MSI on per-HART (or
> per-CPU) basis. It also supports IPIs as software injected MSIs.
> (For more details refer https://github.com/riscv/riscv-aia)
>
> Let us add an early irqchip driver for RISC-V IMSIC which sets
> up the IMSIC state and provide IPIs.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> drivers/irqchip/Kconfig | 7 +
> drivers/irqchip/Makefile | 1 +
> drivers/irqchip/irq-riscv-imsic-early.c | 213 ++++++
> drivers/irqchip/irq-riscv-imsic-state.c | 906 ++++++++++++++++++++++++
> drivers/irqchip/irq-riscv-imsic-state.h | 98 +++
> include/linux/irqchip/riscv-imsic.h | 87 +++
> 6 files changed, 1312 insertions(+)
> create mode 100644 drivers/irqchip/irq-riscv-imsic-early.c
> create mode 100644 drivers/irqchip/irq-riscv-imsic-state.c
> create mode 100644 drivers/irqchip/irq-riscv-imsic-state.h
> create mode 100644 include/linux/irqchip/riscv-imsic.h
>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index f7149d0f3d45..85f86e31c996 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -546,6 +546,13 @@ config SIFIVE_PLIC
> select IRQ_DOMAIN_HIERARCHY
> select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>
> +config RISCV_IMSIC
> + bool
> + depends on RISCV
> + select IRQ_DOMAIN_HIERARCHY
> + select GENERIC_IRQ_MATRIX_ALLOCATOR
> + select GENERIC_MSI_IRQ
> +
> config EXYNOS_IRQ_COMBINER
> bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index ffd945fe71aa..d714724387ce 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o
> obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
> diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
> new file mode 100644
> index 000000000000..32fe428b1c19
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-imsic-early.c
> @@ -0,0 +1,213 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-imsic: " fmt
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/module.h>
> +#include <linux/spinlock.h>
> +#include <linux/smp.h>
> +
> +#include "irq-riscv-imsic-state.h"
> +
> +static int imsic_parent_irq;
> +
> +#ifdef CONFIG_SMP
> +static void imsic_ipi_send(unsigned int cpu)
> +{
> + struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu);
> +
> + writel_relaxed(IMSIC_IPI_ID, local->msi_va);
> +}
> +
> +static void imsic_ipi_starting_cpu(void)
> +{
> + /* Enable IPIs for current CPU. */
> + __imsic_id_set_enable(IMSIC_IPI_ID);
> +}
> +
> +static void imsic_ipi_dying_cpu(void)
> +{
> + /* Disable IPIs for current CPU. */
> + __imsic_id_clear_enable(IMSIC_IPI_ID);
> +}
> +
> +static int __init imsic_ipi_domain_init(void)
> +{
> + int virq;
> +
> + /* Create IMSIC IPI multiplexing */
> + virq = ipi_mux_create(IMSIC_NR_IPI, imsic_ipi_send);
> + if (virq <= 0)
> + return (virq < 0) ? virq : -ENOMEM;

Nit: The parentheses are redundant clutter.

> +
> + /* Set vIRQ range */
> + riscv_ipi_set_virq_range(virq, IMSIC_NR_IPI, true);
> +
> + /* Announce that IMSIC is providing IPIs */
> + pr_info("%pfwP: providing IPIs using interrupt %d\n", imsic->fwnode, IMSIC_IPI_ID);
> +
> + return 0;
> +}
> +#else
> +static void imsic_ipi_starting_cpu(void)
> +{
> +}
> +
> +static void imsic_ipi_dying_cpu(void)
> +{
> +}
> +
> +static int __init imsic_ipi_domain_init(void)
> +{
> + return 0;
> +}
> +#endif
> +
> +/*
> + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> + * Linux interrupt number and let Linux IRQ subsystem handle it.
> + */
> +static void imsic_handle_irq(struct irq_desc *desc)
> +{
> + struct irq_chip *chip = irq_desc_get_chip(desc);
> + int err, cpu = smp_processor_id();
> + struct imsic_vector *vec;
> + unsigned long local_id;
> +
> + chained_irq_enter(chip, desc);
> +
> + while ((local_id = csr_swap(CSR_TOPEI, 0))) {
> + local_id = local_id >> TOPEI_ID_SHIFT;

Nit: Wdyt about moving shift into the loop predicate, or using >>=?
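
A minimal userspace sketch of the two alternatives, for comparison only.
fake_csr_swap() and struct fake_topei are hypothetical stand-ins for
csr_swap(CSR_TOPEI, 0), and TOPEI_ID_SHIFT is assumed to be 16 as in the
kernel headers. Folding the shift into the predicate ends the loop when the
ID field is zero rather than when the whole register is zero; the sketch
assumes those two conditions coincide because ID 0 is never a valid
interrupt identity.

```c
#include <stddef.h>

#define TOPEI_ID_SHIFT 16	/* assumed to match the kernel definition */

/* Hypothetical model of the TOPEI CSR: a queue of pending raw values. */
struct fake_topei {
	const unsigned long *vals;
	size_t nr, pos;
};

/* Stand-in for csr_swap(CSR_TOPEI, 0): returns 0 once nothing is pending. */
static unsigned long fake_csr_swap(struct fake_topei *t)
{
	return t->pos < t->nr ? t->vals[t->pos++] : 0;
}

/* Variant 1: keep the raw read in the predicate, shift with >>=. */
static size_t drain_shift_assign(struct fake_topei *t, unsigned long *ids, size_t max)
{
	unsigned long local_id;
	size_t count = 0;

	while ((local_id = fake_csr_swap(t))) {
		local_id >>= TOPEI_ID_SHIFT;
		if (count < max)
			ids[count++] = local_id;
	}

	return count;
}

/* Variant 2: fold the shift into the loop predicate. */
static size_t drain_shift_in_predicate(struct fake_topei *t, unsigned long *ids, size_t max)
{
	unsigned long local_id;
	size_t count = 0;

	while ((local_id = fake_csr_swap(t) >> TOPEI_ID_SHIFT)) {
		if (count < max)
			ids[count++] = local_id;
	}

	return count;
}
```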

> +
> + if (local_id == IMSIC_IPI_ID) {
> +#ifdef CONFIG_SMP
> + ipi_mux_process();
> +#endif

Is IMSIC_IPI_ID a thing on !IS_ENABLED(CONFIG_SMP)?

> + continue;
> + }
> +
> + if (unlikely(!imsic->base_domain))
> + continue;
> +
> + vec = imsic_vector_from_local_id(cpu, local_id);
> + if (!vec) {
> + pr_warn_ratelimited("vector not found for local ID 0x%lx\n", local_id);
> + continue;
> + }
> +
> + err = generic_handle_domain_irq(imsic->base_domain,
> + vec->hwirq);

Nit: 100 chars

> + if (unlikely(err))
> + pr_warn_ratelimited("hwirq 0x%x mapping not found\n", vec->hwirq);
> + }
> +
> + chained_irq_exit(chip, desc);
> +}
> +
> +static int imsic_starting_cpu(unsigned int cpu)
> +{
> + /* Mark per-CPU IMSIC state as online */
> + imsic_state_online();
> +
> + /* Enable per-CPU parent interrupt */
> + enable_percpu_irq(imsic_parent_irq, irq_get_trigger_type(imsic_parent_irq));
> +
> + /* Setup IPIs */
> + imsic_ipi_starting_cpu();
> +
> + /*
> + * Interrupts identities might have been enabled/disabled while
> + * this CPU was not running so sync-up local enable/disable state.
> + */
> + imsic_local_sync_all();
> +
> + /* Enable local interrupt delivery */
> + imsic_local_delivery(true);
> +
> + return 0;
> +}
> +
> +static int imsic_dying_cpu(unsigned int cpu)
> +{
> + /* Cleanup IPIs */
> + imsic_ipi_dying_cpu();
> +
> + /* Mark per-CPU IMSIC state as offline */
> + imsic_state_offline();
> +
> + return 0;
> +}
> +
> +static int __init imsic_early_probe(struct fwnode_handle *fwnode)
> +{
> + struct irq_domain *domain;
> + int rc;
> +
> + /* Find parent domain and register chained handler */
> + domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(), DOMAIN_BUS_ANY);
> + if (!domain) {
> + pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
> + return -ENOENT;
> + }
> + imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> + if (!imsic_parent_irq) {
> + pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
> + return -ENOENT;
> + }
> +
> + /* Initialize IPI domain */
> + rc = imsic_ipi_domain_init();
> + if (rc) {
> + pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
> + return rc;
> + }
> +
> + /* Setup chained handler to the parent domain interrupt */
> + irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
> +
> + /*
> + * Setup cpuhp state (must be done after setting imsic_parent_irq)
> + *
> + * Don't disable per-CPU IMSIC file when CPU goes offline
> + * because this affects IPI and the masking/unmasking of
> + * virtual IPIs is done via generic IPI-Mux
> + */
> + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "irqchip/riscv/imsic:starting",
> + imsic_starting_cpu, imsic_dying_cpu);
> +
> + return 0;
> +}
> +
> +static int __init imsic_early_dt_init(struct device_node *node,
> + struct device_node *parent)
> +{
> + struct fwnode_handle *fwnode = &node->fwnode;
> + int rc;
> +
> + /* Setup IMSIC state */
> + rc = imsic_setup_state(fwnode);
> + if (rc) {
> + pr_err("%pfwP: failed to setup state (error %d)\n",
> + fwnode, rc);

Nit. 100 chars

> + return rc;
> + }
> +
> + /* Do early setup of IPIs */
> + rc = imsic_early_probe(fwnode);
> + if (rc)
> + return rc;
> +
> + /* Ensure that OF platform device gets probed */
> + of_node_clear_flag(node, OF_POPULATED);
> + return 0;
> +}
> +IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_early_dt_init);
> diff --git a/drivers/irqchip/irq-riscv-imsic-state.c b/drivers/irqchip/irq-riscv-imsic-state.c
> new file mode 100644
> index 000000000000..4f347486ec7c
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-imsic-state.c
> @@ -0,0 +1,906 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-imsic: " fmt
> +#include <linux/cpu.h>
> +#include <linux/bitmap.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_irq.h>
> +#include <linux/seq_file.h>
> +#include <linux/spinlock.h>
> +#include <linux/smp.h>
> +#include <asm/hwcap.h>
> +
> +#include "irq-riscv-imsic-state.h"
> +
> +#define IMSIC_DISABLE_EIDELIVERY 0
> +#define IMSIC_ENABLE_EIDELIVERY 1
> +#define IMSIC_DISABLE_EITHRESHOLD 1
> +#define IMSIC_ENABLE_EITHRESHOLD 0
> +
> +static inline void imsic_csr_write(unsigned long reg, unsigned long val)
> +{
> + csr_write(CSR_ISELECT, reg);
> + csr_write(CSR_IREG, val);
> +}
> +
> +static inline unsigned long imsic_csr_read(unsigned long reg)
> +{
> + csr_write(CSR_ISELECT, reg);
> + return csr_read(CSR_IREG);
> +}
> +
> +static inline unsigned long imsic_csr_read_clear(unsigned long reg, unsigned long val)
> +{
> + csr_write(CSR_ISELECT, reg);
> + return csr_read_clear(CSR_IREG, val);
> +}
> +
> +static inline void imsic_csr_set(unsigned long reg, unsigned long val)
> +{
> + csr_write(CSR_ISELECT, reg);
> + csr_set(CSR_IREG, val);
> +}
> +
> +static inline void imsic_csr_clear(unsigned long reg, unsigned long val)
> +{
> + csr_write(CSR_ISELECT, reg);
> + csr_clear(CSR_IREG, val);
> +}
> +
> +struct imsic_priv *imsic;
> +
> +const struct imsic_global_config *imsic_get_global_config(void)
> +{
> + return imsic ? &imsic->global : NULL;
> +}
> +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> +
> +static bool __imsic_eix_read_clear(unsigned long id, bool pend)
> +{
> + unsigned long isel, imask;
> +
> + isel = id / BITS_PER_LONG;
> + isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> + isel += pend ? IMSIC_EIP0 : IMSIC_EIE0;
> + imask = BIT(id & (__riscv_xlen - 1));
> +
> + return (imsic_csr_read_clear(isel, imask) & imask) ? true : false;

Nit: use return !!(imsic_csr_read_clear(isel, imask) & imask)
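
For reference, a small standalone sketch showing that the two spellings give
the same result. The helper names are made up for this example, and the
imsic_csr_read_clear() call is replaced by a plain "old value" argument so
that it compiles outside the kernel.

```c
#include <stdbool.h>

/* Spelling used in the patch: explicit ternary. */
static bool bit_was_set_ternary(unsigned long old, unsigned long mask)
{
	return (old & mask) ? true : false;
}

/* Suggested spelling: double negation normalizes any non-zero value to 1. */
static bool bit_was_set_bangbang(unsigned long old, unsigned long mask)
{
	return !!(old & mask);
}
```

Both return true even when the tested bit is the top bit of an unsigned
long, so the change is purely about brevity.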

> +}
> +
> +static inline bool __imsic_id_read_clear_enabled(unsigned long id)
> +{
> + return __imsic_eix_read_clear(id, false);
> +}
> +
> +static inline bool __imsic_id_read_clear_pending(unsigned long id)
> +{
> + return __imsic_eix_read_clear(id, true);
> +}
> +
> +void __imsic_eix_update(unsigned long base_id, unsigned long num_id, bool pend, bool val)
> +{
> + unsigned long id = base_id, last_id = base_id + num_id;
> + unsigned long i, isel, ireg;
> +
> + while (id < last_id) {
> + isel = id / BITS_PER_LONG;
> + isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> + isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;

Nit: Redundant parentheses.

> +
> + /*
> + * Prepare the ID mask to be programmed in the
> + * IMSIC EIEx and EIPx registers. These registers
> + * are XLEN-wide and we must not touch IDs which
> + * are < base_id and >= (base_id + num_id).
> + */
> + ireg = 0;
> + for (i = id & (__riscv_xlen - 1); (id < last_id) && (i < __riscv_xlen); i++) {

Nit: Redundant parentheses in "(id < last_id) && (i < __riscv_xlen)", which
is also inconsistent with other usage in this changeset.

> + ireg |= BIT(i);
> + id++;
> + }
> +
> + /*
> + * The IMSIC EIEx and EIPx registers are indirectly
> + * accessed via using ISELECT and IREG CSRs so we
> + * need to access these CSRs without getting preempted.
> + *
> + * All existing users of this function call this
> + * function with local IRQs disabled so we don't
> + * need to do anything special here.
> + */
> + if (val)
> + imsic_csr_set(isel, ireg);
> + else
> + imsic_csr_clear(isel, ireg);
> + }
> +}
> +
> +/* MUST be called with lpriv->lock held */
> +static void __imsic_local_sync(struct imsic_local_priv *lpriv)
> +{
> + struct imsic_local_config *mlocal;
> + struct imsic_vector *vec, *mvec;
> + int i;
> +
> + /* This pairs with the barrier in __imsic_remote_sync(). */
> + smp_mb();
> +
> + for_each_set_bit(i, lpriv->dirty_bitmap, imsic->global.nr_ids + 1) {
> + if (!i || i == IMSIC_IPI_ID)
> + goto skip;
> + vec = &lpriv->vectors[i];
> +
> + if (vec->enable)
> + __imsic_id_set_enable(i);
> + else
> + __imsic_id_clear_enable(i);
> +
> + /*
> + * If the ID was being moved to a new ID on some other CPU
> + * then we can get a MSI during the movement so check the
> + * ID pending bit and re-trigger the new ID on other CPU
> + * using MMIO write.
> + */
> + mvec = vec->move;
> + vec->move = NULL;
> + if (mvec && mvec != vec) {
> + if (__imsic_id_read_clear_pending(i)) {
> + mlocal = per_cpu_ptr(imsic->global.local, mvec->cpu);
> + writel_relaxed(mvec->local_id, mlocal->msi_va);
> + }
> +
> + imsic_vector_free(&lpriv->vectors[i]);
> + }
> +
> +skip:
> + bitmap_clear(lpriv->dirty_bitmap, i, 1);
> + }
> +}
> +
> +void imsic_local_sync_all(void)
> +{
> + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&lpriv->lock, flags);
> + bitmap_fill(lpriv->dirty_bitmap, imsic->global.nr_ids + 1);
> + __imsic_local_sync(lpriv);
> + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> +}
> +
> +void imsic_local_delivery(bool enable)
> +{
> + if (enable) {
> + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> + return;
> + }
> +
> + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> +}
> +
> +#ifdef CONFIG_SMP
> +static void imsic_local_timer_callback(struct timer_list *timer)
> +{
> + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&lpriv->lock, flags);
> + __imsic_local_sync(lpriv);
> + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> +}
> +
> +/* MUST be called with lpriv->lock held */
> +static void __imsic_remote_sync(struct imsic_local_priv *lpriv, unsigned int cpu)
> +{
> + /*
> + * Ensure that changes to vector enable, vector move and
> + * dirty bitmap are visible to the target CPU.
> + *
> + * This pairs with the barrier in __imsic_local_sync().
> + */
> + smp_mb();
> +
> + /*
> + * We schedule a timer on the target CPU if the target CPU is not
> + * same as the current CPU. An offline CPU will unconditionally
> + * synchronize IDs through imsic_starting_cpu() when the
> + * CPU is brought up.
> + */
> + if (cpu_online(cpu)) {
> + if (cpu != smp_processor_id()) {
> + if (!timer_pending(&lpriv->timer)) {
> + lpriv->timer.expires = jiffies + 1;
> + add_timer_on(&lpriv->timer, cpu);
> + }
> + } else {
> + __imsic_local_sync(lpriv);
> + }

Nit: Early exit/return vs else-clause for readability
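
Something along these lines, sketched as a pure decision function so it can
be read (and compiled) in isolation. The enum and the function name are
hypothetical; in the driver the three outcomes would map to doing nothing,
calling __imsic_local_sync(), and arming the per-CPU timer.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the remote sync decision. */
enum sync_action {
	SYNC_NONE,		/* target offline: synced later at CPU bring-up */
	SYNC_LOCAL,		/* target is the current CPU: sync right away */
	SYNC_TIMER_PENDING,	/* timer already queued: nothing more to do */
	SYNC_ARM_TIMER,		/* queue the timer on the target CPU */
};

/* Same branches as the nested if/else, flattened with early returns. */
static enum sync_action remote_sync_action(bool target_online, bool target_is_self,
					   bool timer_pending)
{
	if (!target_online)
		return SYNC_NONE;

	if (target_is_self)
		return SYNC_LOCAL;

	if (timer_pending)
		return SYNC_TIMER_PENDING;

	return SYNC_ARM_TIMER;
}
```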


> + }
> +}
> +#else
> +/* MUST be called with lpriv->lock held */
> +static void __imsic_remote_sync(struct imsic_local_priv *lpriv, unsigned int cpu)
> +{
> + __imsic_local_sync(lpriv);
> +}
> +#endif
> +
> +void imsic_vector_mask(struct imsic_vector *vec)
> +{
> + struct imsic_local_priv *lpriv;
> +
> + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> + return;
> +
> + /*
> + * This function is called through Linux irq subsystem with
> + * irqs disabled so no need to save/restore irq flags.
> + */
> +
> + raw_spin_lock(&lpriv->lock);
> +
> + vec->enable = false;
> + bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
> + __imsic_remote_sync(lpriv, vec->cpu);
> +
> + raw_spin_unlock(&lpriv->lock);
> +}

Really nice that you're using a timer for the vector affinity change,
and got rid of the special/weird IMSIC/sync IPI. Can you really use a
timer for mask/unmask? That makes the mask/unmask operation
asynchronous!

That was what I was trying to get at with this comment:
https://lore.kernel.org/linux-riscv/[email protected]/

Also, using the smp_* IPI functions, you can pass arguments, so you
don't need the dirty_bitmap tracking the changes.
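
To spell out the asynchrony concern, here is a toy single-threaded model
(not driver code; every name in it is made up). It keeps the software view
and the modelled hardware enable bit separate. With the deferred scheme the
hardware bit of a vector on a remote CPU only changes when the timer
callback runs, so the interrupt can still fire after mask() has returned. A
synchronous update, which is what a cross-CPU function call that waits for
completion would give, closes that window.

```c
#include <stdbool.h>

/* Toy model of one vector: software state vs. modelled hardware state. */
struct model_vector {
	bool sw_enable;	/* what the driver recorded */
	bool hw_enable;	/* what the modelled interrupt file has programmed */
	bool dirty;	/* software state not yet written to hardware */
};

/* Deferred mask: record the request, leave the hardware for the timer. */
static void model_mask_deferred(struct model_vector *v)
{
	v->sw_enable = false;
	v->dirty = true;
}

/* Models the timer callback running later on the target CPU. */
static void model_timer_sync(struct model_vector *v)
{
	if (!v->dirty)
		return;

	v->hw_enable = v->sw_enable;
	v->dirty = false;
}

/* Synchronous mask: hardware is updated before the caller continues. */
static void model_mask_synchronous(struct model_vector *v)
{
	v->sw_enable = false;
	v->hw_enable = false;
}

/* An interrupt is delivered only if the hardware enable bit is set. */
static bool model_can_fire(const struct model_vector *v)
{
	return v->hw_enable;
}
```

In the patch the deferred path is only taken when the target CPU differs
from the current one; the local case already syncs immediately.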

> +
> +void imsic_vector_unmask(struct imsic_vector *vec)
> +{
> + struct imsic_local_priv *lpriv;
> +
> + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> + return;
> +
> + /*
> + * This function is called through Linux irq subsystem with
> + * irqs disabled so no need to save/restore irq flags.
> + */
> +
> + raw_spin_lock(&lpriv->lock);
> +
> + vec->enable = true;
> + bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
> + __imsic_remote_sync(lpriv, vec->cpu);
> +
> + raw_spin_unlock(&lpriv->lock);
> +}
> +
> +
> +bool imsic_vector_isenabled(struct imsic_vector *vec)
> +{
> + struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + unsigned long flags;
> + bool ret;
> +
> + raw_spin_lock_irqsave(&lpriv->lock, flags);
> + ret = vec->enable;
> + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> +
> + return ret;
> +}
> +
> +struct imsic_vector *imsic_vector_get_move(struct imsic_vector *vec)
> +{
> + struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + struct imsic_vector *ret;
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&lpriv->lock, flags);
> + ret = vec->move;
> + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> +
> + return ret;
> +}
> +
> +static bool imsic_vector_move_update(struct imsic_local_priv *lpriv, struct imsic_vector *vec,
> + bool new_enable, struct imsic_vector *new_move)
> +{
> + unsigned long flags;
> + bool enabled;
> +
> + raw_spin_lock_irqsave(&lpriv->lock, flags);
> +
> + /* Update enable and move details */
> + enabled = vec->enable;
> + vec->enable = new_enable;
> + vec->move = new_move;
> +
> + /* Mark the vector as dirty and synchronize */
> + bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
> + __imsic_remote_sync(lpriv, vec->cpu);
> +
> + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> +
> + return enabled;
> +}
> +
> +void imsic_vector_move(struct imsic_vector *old_vec, struct imsic_vector *new_vec)
> +{
> + struct imsic_local_priv *old_lpriv, *new_lpriv;
> + bool enabled;
> +
> + if (WARN_ON(old_vec->cpu == new_vec->cpu))
> + return;
> +
> + old_lpriv = per_cpu_ptr(imsic->lpriv, old_vec->cpu);
> + if (WARN_ON(&old_lpriv->vectors[old_vec->local_id] != old_vec))
> + return;
> +
> + new_lpriv = per_cpu_ptr(imsic->lpriv, new_vec->cpu);
> + if (WARN_ON(&new_lpriv->vectors[new_vec->local_id] != new_vec))
> + return;
> +
> + /*
> + * Move and re-trigger the new vector based on the pending
> + * state of the old vector because we might get a device
> + * interrupt on the old vector while device was being moved
> + * to the new vector.
> + */
> + enabled = imsic_vector_move_update(old_lpriv, old_vec, false, new_vec);
> + imsic_vector_move_update(new_lpriv, new_vec, enabled, new_vec);
> +}
> +
> +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
> +void imsic_vector_debug_show(struct seq_file *m, struct imsic_vector *vec, int ind)
> +{
> + struct imsic_local_priv *lpriv;
> + struct imsic_vector *mvec;
> + bool is_enabled;
> +
> + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> + return;
> +
> + is_enabled = imsic_vector_isenabled(vec);
> + mvec = imsic_vector_get_move(vec);
> +
> + seq_printf(m, "%*starget_cpu : %5u\n", ind, "", vec->cpu);
> + seq_printf(m, "%*starget_local_id : %5u\n", ind, "", vec->local_id);
> + seq_printf(m, "%*sis_reserved : %5u\n", ind, "",
> + (vec->local_id <= IMSIC_IPI_ID) ? 1 : 0);
> + seq_printf(m, "%*sis_enabled : %5u\n", ind, "", (is_enabled) ? 1 : 0);
> + seq_printf(m, "%*sis_move_pending : %5u\n", ind, "", (mvec) ? 1 : 0);

Nit: Redundant parentheses.

> + if (mvec) {
> + seq_printf(m, "%*smove_cpu : %5u\n", ind, "", mvec->cpu);
> + seq_printf(m, "%*smove_local_id : %5u\n", ind, "", mvec->local_id);
> + }
> +}
> +
> +void imsic_vector_debug_show_summary(struct seq_file *m, int ind)
> +{
> + irq_matrix_debug_show(m, imsic->matrix, ind);
> +}
> +#endif
> +
> +struct imsic_vector *imsic_vector_from_local_id(unsigned int cpu, unsigned int local_id)
> +{
> + struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> +
> + if (!lpriv || imsic->global.nr_ids < local_id)
> + return NULL;
> +
> + return &lpriv->vectors[local_id];
> +}
> +
> +struct imsic_vector *imsic_vector_alloc(unsigned int hwirq, const struct cpumask *mask)
> +{
> + struct imsic_vector *vec = NULL;
> + struct imsic_local_priv *lpriv;
> + unsigned long flags;
> + unsigned int cpu;
> + int local_id;
> +
> + raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
> + local_id = irq_matrix_alloc(imsic->matrix, mask, false, &cpu);
> + raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
> + if (local_id < 0)
> + return NULL;
> +
> + lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> + vec = &lpriv->vectors[local_id];
> + vec->hwirq = hwirq;
> + vec->enable = false;
> + vec->move = NULL;
> +
> + return vec;
> +}
> +
> +void imsic_vector_free(struct imsic_vector *vec)
> +{
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
> + vec->hwirq = UINT_MAX;
> + irq_matrix_free(imsic->matrix, vec->cpu, vec->local_id, false);
> + raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
> +}
> +
> +static void __init imsic_local_cleanup(void)
> +{
> + int cpu;
> + struct imsic_local_priv *lpriv;
> +
> + for_each_possible_cpu(cpu) {
> + lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> +
> + bitmap_free(lpriv->dirty_bitmap);
> + kfree(lpriv->vectors);
> + }
> +
> + free_percpu(imsic->lpriv);
> +}
> +
> +static int __init imsic_local_init(void)
> +{
> + struct imsic_global_config *global = &imsic->global;
> + struct imsic_local_priv *lpriv;
> + struct imsic_vector *vec;
> + int cpu, i;
> +
> + /* Allocate per-CPU private state */
> + imsic->lpriv = alloc_percpu(typeof(*(imsic->lpriv)));
> + if (!imsic->lpriv)
> + return -ENOMEM;
> +
> + /* Setup per-CPU private state */
> + for_each_possible_cpu(cpu) {
> + lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> +
> + raw_spin_lock_init(&lpriv->lock);
> +
> + /* Allocate dirty bitmap */
> + lpriv->dirty_bitmap = bitmap_zalloc(global->nr_ids + 1, GFP_KERNEL);
> + if (!lpriv->dirty_bitmap)
> + goto fail_local_cleanup;
> +
> +#ifdef CONFIG_SMP
> + /* Setup lazy timer for synchronization */
> + timer_setup(&lpriv->timer, imsic_local_timer_callback, TIMER_PINNED);
> +#endif
> +
> + /* Allocate vector array */
> + lpriv->vectors = kcalloc(global->nr_ids + 1, sizeof(*lpriv->vectors),
> + GFP_KERNEL);
> + if (!lpriv->vectors)
> + goto fail_local_cleanup;
> +
> + /* Setup vector array */
> + for (i = 0; i <= global->nr_ids; i++) {
> + vec = &lpriv->vectors[i];
> + vec->cpu = cpu;
> + vec->local_id = i;
> + vec->hwirq = UINT_MAX;
> + }
> + }
> +
> + return 0;
> +
> +fail_local_cleanup:
> + imsic_local_cleanup();
> + return -ENOMEM;
> +}
> +
> +void imsic_state_online(void)
> +{
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
> + irq_matrix_online(imsic->matrix);
> + raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
> +}
> +
> +void imsic_state_offline(void)
> +{
> +#ifdef CONFIG_SMP
> + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> +#endif
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
> + irq_matrix_offline(imsic->matrix);
> + raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
> +
> +#ifdef CONFIG_SMP
> + raw_spin_lock_irqsave(&lpriv->lock, flags);
> + WARN_ON_ONCE(try_to_del_timer_sync(&lpriv->timer) < 0);
> + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> +#endif
> +}
> +
> +static int __init imsic_matrix_init(void)
> +{
> + struct imsic_global_config *global = &imsic->global;
> +
> + raw_spin_lock_init(&imsic->matrix_lock);
> + imsic->matrix = irq_alloc_matrix(global->nr_ids + 1,
> + 0, global->nr_ids + 1);
> + if (!imsic->matrix)
> + return -ENOMEM;
> +
> + /* Reserve ID#0 because it is special and never implemented */
> + irq_matrix_assign_system(imsic->matrix, 0, false);
> +
> + /* Reserve IPI ID because it is special and used internally */
> + irq_matrix_assign_system(imsic->matrix, IMSIC_IPI_ID, false);
> +
> + return 0;
> +}
> +
> +static int __init imsic_get_parent_hartid(struct fwnode_handle *fwnode,
> + u32 index, unsigned long *hartid)
> +{
> + struct of_phandle_args parent;
> + int rc;
> +
> + /*
> + * Currently, only OF fwnode is supported so extend this
> + * function for ACPI support.
> + */
> + if (!is_of_node(fwnode))
> + return -EINVAL;
> +
> + rc = of_irq_parse_one(to_of_node(fwnode), index, &parent);
> + if (rc)
> + return rc;
> +
> + /*
> + * Skip interrupts other than external interrupts for
> + * current privilege level.
> + */
> + if (parent.args[0] != RV_IRQ_EXT)
> + return -EINVAL;
> +
> + return riscv_of_parent_hartid(parent.np, hartid);
> +}
> +
> +static int __init imsic_get_mmio_resource(struct fwnode_handle *fwnode,
> + u32 index, struct resource *res)
> +{
> + /*
> + * Currently, only OF fwnode is supported so extend this
> + * function for ACPI support.
> + */
> + if (!is_of_node(fwnode))
> + return -EINVAL;
> +
> + return of_address_to_resource(to_of_node(fwnode), index, res);
> +}
> +
> +static int __init imsic_parse_fwnode(struct fwnode_handle *fwnode,
> + struct imsic_global_config *global,
> + u32 *nr_parent_irqs,
> + u32 *nr_mmios)
> +{
> + unsigned long hartid;
> + struct resource res;
> + int rc;
> + u32 i;
> +
> + /*
> + * Currently, only OF fwnode is supported so extend this
> + * function for ACPI support.
> + */
> + if (!is_of_node(fwnode))
> + return -EINVAL;
> +
> + *nr_parent_irqs = 0;
> + *nr_mmios = 0;
> +
> + /* Find number of parent interrupts */
> + while (!imsic_get_parent_hartid(fwnode, *nr_parent_irqs, &hartid))
> + (*nr_parent_irqs)++;
> + if (!(*nr_parent_irqs)) {

Nit: Redundant parentheses

> + pr_err("%pfwP: no parent irqs available\n", fwnode);
> + return -EINVAL;
> + }
> +
> + /* Find number of guest index bits in MSI address */
> + rc = of_property_read_u32(to_of_node(fwnode), "riscv,guest-index-bits",
> + &global->guest_index_bits);
> + if (rc)
> + global->guest_index_bits = 0;
> +
> + /* Find number of HART index bits */
> + rc = of_property_read_u32(to_of_node(fwnode), "riscv,hart-index-bits",
> + &global->hart_index_bits);
> + if (rc) {
> + /* Assume default value */
> + global->hart_index_bits = __fls(*nr_parent_irqs);
> + if (BIT(global->hart_index_bits) < *nr_parent_irqs)
> + global->hart_index_bits++;
> + }
> +
> + /* Find number of group index bits */
> + rc = of_property_read_u32(to_of_node(fwnode), "riscv,group-index-bits",
> + &global->group_index_bits);
> + if (rc)
> + global->group_index_bits = 0;
> +
> + /*
> + * Find first bit position of group index.
> + * If not specified assumed the default APLIC-IMSIC configuration.
> + */
> + rc = of_property_read_u32(to_of_node(fwnode), "riscv,group-index-shift",
> + &global->group_index_shift);
> + if (rc)
> + global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> +
> + /* Find number of interrupt identities */
> + rc = of_property_read_u32(to_of_node(fwnode), "riscv,num-ids",
> + &global->nr_ids);
> + if (rc) {
> + pr_err("%pfwP: number of interrupt identities not found\n",
> + fwnode);
> + return rc;
> + }
> +
> + /* Find number of guest interrupt identities */
> + rc = of_property_read_u32(to_of_node(fwnode), "riscv,num-guest-ids",
> + &global->nr_guest_ids);
> + if (rc)
> + global->nr_guest_ids = global->nr_ids;
> +
> + /* Sanity check guest index bits */
> + i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> + if (i < global->guest_index_bits) {
> + pr_err("%pfwP: guest index bits too big\n", fwnode);
> + return -EINVAL;
> + }
> +
> + /* Sanity check HART index bits */
> + i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT - global->guest_index_bits;
> + if (i < global->hart_index_bits) {
> + pr_err("%pfwP: HART index bits too big\n", fwnode);
> + return -EINVAL;
> + }
> +
> + /* Sanity check group index bits */
> + i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> + global->guest_index_bits - global->hart_index_bits;
> + if (i < global->group_index_bits) {
> + pr_err("%pfwP: group index bits too big\n", fwnode);
> + return -EINVAL;
> + }
> +
> + /* Sanity check group index shift */
> + i = global->group_index_bits + global->group_index_shift - 1;
> + if (i >= BITS_PER_LONG) {
> + pr_err("%pfwP: group index shift too big\n", fwnode);
> + return -EINVAL;
> + }
> +
> + /* Sanity check number of interrupt identities */
> + if ((global->nr_ids < IMSIC_MIN_ID) ||
> + (global->nr_ids >= IMSIC_MAX_ID) ||
> + ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> + pr_err("%pfwP: invalid number of interrupt identities\n",
> + fwnode);

Nit: 100 chars

> + return -EINVAL;
> + }
> +
> + /* Sanity check number of guest interrupt identities */
> + if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> + (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> + ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> + pr_err("%pfwP: invalid number of guest interrupt identities\n",
> + fwnode);

Nit: 100 chars

> + return -EINVAL;
> + }
> +
> + /* Compute base address */
> + rc = imsic_get_mmio_resource(fwnode, 0, &res);
> + if (rc) {
> + pr_err("%pfwP: first MMIO resource not found\n", fwnode);
> + return -EINVAL;
> + }
> + global->base_addr = res.start;
> + global->base_addr &= ~(BIT(global->guest_index_bits +
> + global->hart_index_bits +
> + IMSIC_MMIO_PAGE_SHIFT) - 1);
> + global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> + global->group_index_shift);
> +
> + /* Find number of MMIO register sets */
> + while (!imsic_get_mmio_resource(fwnode, *nr_mmios, &res))
> + (*nr_mmios)++;
> +
> + return 0;
> +}
> +
> +int __init imsic_setup_state(struct fwnode_handle *fwnode)
> +{
> + u32 i, j, index, nr_parent_irqs, nr_mmios, nr_handlers = 0;
> + struct imsic_global_config *global;
> + struct imsic_local_config *local;
> + void __iomem **mmios_va = NULL;
> + struct resource *mmios = NULL;
> + unsigned long reloff, hartid;
> + phys_addr_t base_addr;
> + int rc, cpu;
> +
> + /*
> + * Only one IMSIC instance allowed in a platform for clean
> + * implementation of SMP IRQ affinity and per-CPU IPIs.
> + *
> + * This means on a multi-socket (or multi-die) platform we
> + * will have multiple MMIO regions for one IMSIC instance.
> + */
> + if (imsic) {
> + pr_err("%pfwP: already initialized hence ignoring\n",
> + fwnode);

Nit: 100 chars

> + return -EALREADY;
> + }
> +
> + if (!riscv_isa_extension_available(NULL, SxAIA)) {
> + pr_err("%pfwP: AIA support not available\n", fwnode);
> + return -ENODEV;
> + }
> +
> + imsic = kzalloc(sizeof(*imsic), GFP_KERNEL);
> + if (!imsic)
> + return -ENOMEM;
> + imsic->fwnode = fwnode;
> + global = &imsic->global;
> +
> + global->local = alloc_percpu(typeof(*(global->local)));
> + if (!global->local) {
> + rc = -ENOMEM;
> + goto out_free_priv;
> + }
> +
> + /* Parse IMSIC fwnode */
> + rc = imsic_parse_fwnode(fwnode, global, &nr_parent_irqs, &nr_mmios);
> + if (rc)
> + goto out_free_local;
> +
> + /* Allocate MMIO resource array */
> + mmios = kcalloc(nr_mmios, sizeof(*mmios), GFP_KERNEL);
> + if (!mmios) {
> + rc = -ENOMEM;
> + goto out_free_local;
> + }
> +
> + /* Allocate MMIO virtual address array */
> + mmios_va = kcalloc(nr_mmios, sizeof(*mmios_va), GFP_KERNEL);
> + if (!mmios_va) {
> + rc = -ENOMEM;
> + goto out_iounmap;
> + }
> +
> + /* Parse and map MMIO register sets */
> + for (i = 0; i < nr_mmios; i++) {
> + rc = imsic_get_mmio_resource(fwnode, i, &mmios[i]);
> + if (rc) {
> + pr_err("%pfwP: unable to parse MMIO regset %d\n",
> + fwnode, i);

Nit: 100 chars

> + goto out_iounmap;
> + }
> +
> + base_addr = mmios[i].start;
> + base_addr &= ~(BIT(global->guest_index_bits +
> + global->hart_index_bits +
> + IMSIC_MMIO_PAGE_SHIFT) - 1);
> + base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> + global->group_index_shift);
> + if (base_addr != global->base_addr) {
> + rc = -EINVAL;
> + pr_err("%pfwP: address mismatch for regset %d\n",
> + fwnode, i);

Nit: 100 chars... and all the places below where applicable.
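
For readers following the quoted parsing code: when "riscv,hart-index-bits" is absent, the default is the smallest bit count that can index every parent interrupt, i.e. ceil(log2(nr_parent_irqs)). Below is a minimal userspace sketch of that arithmetic. The names fls_index() and default_hart_index_bits() are hypothetical stand-ins, not part of the patch; fls_index() only imitates what the kernel's __fls() returns.

```c
#include <assert.h>

/*
 * Hypothetical userspace stand-in for the kernel's __fls():
 * returns the index of the most significant set bit.
 */
static unsigned int fls_index(unsigned long x)
{
	unsigned int idx = 0;

	assert(x != 0);	/* undefined for zero, as with __fls() */
	while (x >>= 1)
		idx++;
	return idx;
}

/*
 * Mirrors the quoted default: start from the top set bit and
 * round up when nr_parent_irqs is not a power of two.
 */
static unsigned int default_hart_index_bits(unsigned long nr_parent_irqs)
{
	unsigned int bits = fls_index(nr_parent_irqs);

	if ((1UL << bits) < nr_parent_irqs)
		bits++;
	return bits;
}
```

With this sketch, 4 parent interrupts need 2 index bits and 5 need 3. The quoted code rejects a count of zero before reaching this computation, which is why the sketch asserts on it.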


Björn

2024-02-20 11:53:22

by Björn Töpel

Subject: Re: [PATCH v13 00/13] Linux RISC-V AIA Support

Anup Patel <[email protected]> writes:

> The RISC-V AIA specification is ratified as-per the RISC-V international
> process. The latest ratified AIA specification can be found at:
> https://github.com/riscv/riscv-aia/releases/download/1.0/riscv-interrupts-1.0.pdf
>
> At a high-level, the AIA specification adds three things:
> 1) AIA CSRs
> - Improved local interrupt support
> 2) Incoming Message Signaled Interrupt Controller (IMSIC)
> - Per-HART MSI controller
> - Support MSI virtualization
> - Support IPI along with virtualization
> 3) Advanced Platform-Level Interrupt Controller (APLIC)
> - Wired interrupt controller
> - In MSI-mode, converts wired interrupt into MSIs (i.e. MSI generator)
> - In Direct-mode, injects external interrupts directly into HARTs
>
> For an overview of the AIA specification, refer the AIA virtualization
> talk at KVM Forum 2022:
> https://static.sched.com/hosted_files/kvmforum2022/a1/AIA_Virtualization_in_KVM_RISCV_final.pdf
> https://www.youtube.com/watch?v=r071dL8Z0yo
>
> To test this series, use QEMU v7.2 (or higher) and OpenSBI v1.2 (or higher).
>
> This series depends upon per-device MSI domain patches merged by Thomas (tglx)
> which are available in irq/msi branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
>
> These patches can also be found in the riscv_aia_v13 branch at:
> https://github.com/avpatel/linux.git
>
> Changes since v12:
> - Rebased on Linux-6.8-rc5
> - Dropped per-device MSI domain patches which are already merged by Thomas (tglx)
> - Addressed nit comments from Thomas and Clement
> - Added a new patch2 to fix lock dependency warning
> - Replaced local sync IPI in the IMSIC driver with per-CPU timer
> - Simplified locking in the IMSIC driver to avoid lock dependency issues
> - Added a dirty bitmap in the IMSIC driver to optimize per-CPU local sync loop

Thanks, Anup.

I will take it for a spin, with Alex' v1 of the stop_machine()/ftrace
IPI fix.

The defconfig change (12/13) breaks a bunch of builds:
https://patchwork.kernel.org/project/linux-riscv/list/?series=827706

Download the logs here:
https://github.com/linux-riscv/linux-riscv/suites/20917102160/logs?attempt=1
and grep for '##[error]'

Björn

2024-02-20 11:53:35

by Björn Töpel

Subject: Re: [PATCH v13 07/13] irqchip/riscv-imsic: Add device MSI domain support for platform devices

Anup Patel <[email protected]> writes:

> The Linux platform MSI support allows per-device MSI domains so let
> us add a platform irqchip driver for RISC-V IMSIC which provides a
> base IRQ domain with MSI parent support for platform device domains.
>
> This driver assumes that the IMSIC state is already initialized by
> the IMSIC early driver.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> drivers/irqchip/Makefile | 2 +-
> drivers/irqchip/irq-riscv-imsic-platform.c | 346 +++++++++++++++++++++
> drivers/irqchip/irq-riscv-imsic-state.h | 1 +
> 3 files changed, 348 insertions(+), 1 deletion(-)
> create mode 100644 drivers/irqchip/irq-riscv-imsic-platform.c
>
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index d714724387ce..abca445a3229 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -95,7 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> -obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o
> +obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o irq-riscv-imsic-platform.o
> obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
> diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c
> new file mode 100644
> index 000000000000..7ee44c493dbc
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-imsic-platform.c
> @@ -0,0 +1,346 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-imsic: " fmt
> +#include <linux/bitmap.h>
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> +#include <linux/msi.h>
> +#include <linux/platform_device.h>
> +#include <linux/spinlock.h>
> +#include <linux/smp.h>
> +
> +#include "irq-riscv-imsic-state.h"
> +
> +static bool imsic_cpu_page_phys(unsigned int cpu, unsigned int guest_index,
> + phys_addr_t *out_msi_pa)
> +{
> + struct imsic_global_config *global;
> + struct imsic_local_config *local;
> +
> + global = &imsic->global;
> + local = per_cpu_ptr(global->local, cpu);
> +
> + if (BIT(global->guest_index_bits) <= guest_index)
> + return false;
> +
> + if (out_msi_pa)
> + *out_msi_pa = local->msi_pa +
> + (guest_index * IMSIC_MMIO_PAGE_SZ);
> +
> + return true;
> +}
> +
> +static void imsic_irq_mask(struct irq_data *d)
> +{
> + imsic_vector_mask(irq_data_get_irq_chip_data(d));
> +}
> +
> +static void imsic_irq_unmask(struct irq_data *d)
> +{
> + imsic_vector_unmask(irq_data_get_irq_chip_data(d));
> +}
> +
> +static int imsic_irq_retrigger(struct irq_data *d)
> +{
> + struct imsic_vector *vec = irq_data_get_irq_chip_data(d);
> + struct imsic_local_config *local;
> +
> + if (WARN_ON(vec == NULL))

Checkpatch: use !vec

> + return -ENOENT;
> +
> + local = per_cpu_ptr(imsic->global.local, vec->cpu);
> + writel_relaxed(vec->local_id, local->msi_va);
> + return 0;
> +}
> +
> +static void imsic_irq_compose_vector_msg(struct imsic_vector *vec, struct msi_msg *msg)
> +{
> + phys_addr_t msi_addr;
> +
> + if (WARN_ON(vec == NULL))

Checkpatch: use !vec

> + return;
> +
> + if (WARN_ON(!imsic_cpu_page_phys(vec->cpu, 0, &msi_addr)))
> + return;
> +
> + msg->address_hi = upper_32_bits(msi_addr);
> + msg->address_lo = lower_32_bits(msi_addr);
> + msg->data = vec->local_id;
> +}
> +
> +static void imsic_irq_compose_msg(struct irq_data *d, struct msi_msg *msg)
> +{
> + imsic_irq_compose_vector_msg(irq_data_get_irq_chip_data(d), msg);
> +}
> +
> +#ifdef CONFIG_SMP
> +static void imsic_msi_update_msg(struct irq_data *d, struct imsic_vector *vec)
> +{
> + struct msi_msg msg[2] = { [1] = { }, };
> +
> + imsic_irq_compose_vector_msg(vec, msg);
> + irq_data_get_irq_chip(d)->irq_write_msi_msg(d, msg);
> +}
> +
> +static int imsic_irq_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
> + bool force)
> +{
> + struct imsic_vector *old_vec, *new_vec;
> + struct irq_data *pd = d->parent_data;
> +
> + old_vec = irq_data_get_irq_chip_data(pd);
> + if (WARN_ON(old_vec == NULL))

Checkpatch: use !old_vec

> + return -ENOENT;
> +
> + /* If old vector cpu belongs to the target cpumask then do nothing */
> + if (cpumask_test_cpu(old_vec->cpu, mask_val))
> + return IRQ_SET_MASK_OK_DONE;
> +
> + /* If move is already in-flight then return failure */
> + if (imsic_vector_get_move(old_vec))
> + return -EBUSY;
> +
> + /* Get a new vector on the desired set of CPUs */
> + new_vec = imsic_vector_alloc(old_vec->hwirq, mask_val);
> + if (!new_vec)
> + return -ENOSPC;
> +
> + /* Point device to the new vector */
> + imsic_msi_update_msg(d, new_vec);
> +
> + /* Update irq descriptors with the new vector */
> + pd->chip_data = new_vec;
> +
> + /* Update effective affinity of parent irq data */
> + irq_data_update_effective_affinity(pd, cpumask_of(new_vec->cpu));
> +
> + /* Move state of the old vector to the new vector */
> + imsic_vector_move(old_vec, new_vec);
> +
> + return IRQ_SET_MASK_OK_DONE;
> +}
> +#endif
> +
> +static struct irq_chip imsic_irq_base_chip = {
> + .name = "IMSIC",
> + .irq_mask = imsic_irq_mask,
> + .irq_unmask = imsic_irq_unmask,
> + .irq_retrigger = imsic_irq_retrigger,
> + .irq_compose_msi_msg = imsic_irq_compose_msg,
> + .flags = IRQCHIP_SKIP_SET_WAKE |
> + IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int imsic_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
> + unsigned int nr_irqs, void *args)
> +{
> + struct imsic_vector *vec;
> +
> + /* Legacy-MSI or multi-MSI not supported yet. */
> + if (nr_irqs > 1)
> + return -ENOTSUPP;

Checkpatch: WARNING: ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP


Björn

2024-02-20 11:53:48

by Björn Töpel

Subject: Re: [PATCH v13 06/13] irqchip: Add RISC-V incoming MSI controller early driver

Anup Patel <[email protected]> writes:

> The RISC-V advanced interrupt architecture (AIA) specification
> defines a new MSI controller called incoming message signalled
> interrupt controller (IMSIC) which manages MSI on per-HART (or
> per-CPU) basis. It also supports IPIs as software injected MSIs.
> (For more details refer https://github.com/riscv/riscv-aia)
>
> Let us add an early irqchip driver for RISC-V IMSIC which sets
> up the IMSIC state and provide IPIs.
>
> Signed-off-by: Anup Patel <[email protected]>

This patch has a couple of checkpatch issues:

CHECK: Alignment should match open parenthesis
CHECK: Please don't use multiple blank lines
CHECK: Please use a blank line after function/struct/union/enum declarations
CHECK: Unnecessary parentheses around 'global->nr_guest_ids < IMSIC_MIN_ID'
CHECK: Unnecessary parentheses around 'global->nr_guest_ids >= IMSIC_MAX_ID'
CHECK: Unnecessary parentheses around 'global->nr_ids < IMSIC_MIN_ID'
CHECK: Unnecessary parentheses around 'global->nr_ids >= IMSIC_MAX_ID'
CHECK: Unnecessary parentheses around global->local
CHECK: Unnecessary parentheses around imsic->lpriv
CHECK: extern prototypes should be avoided in .h files
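
As an illustration of the "Unnecessary parentheses" items, here is a sketch of the identity-count sanity check with the inner parentheses dropped. IMSIC_MIN_ID and IMSIC_MAX_ID are not defined anywhere in this thread; the values 63 and 2048 below are assumptions made only for the example, and imsic_nr_ids_valid() is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values; the real definitions are not shown in this thread. */
#define IMSIC_MIN_ID	63
#define IMSIC_MAX_ID	2048

/*
 * Same three conditions as the quoted check. The parentheses around
 * the bitwise AND must stay: '&' binds less tightly than '!='.
 */
static bool imsic_nr_ids_valid(unsigned int nr_ids)
{
	if (nr_ids < IMSIC_MIN_ID ||
	    nr_ids >= IMSIC_MAX_ID ||
	    (nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)
		return false;
	return true;
}

/* Small usage example covering the boundary cases. */
void imsic_nr_ids_self_check(void)
{
	assert(imsic_nr_ids_valid(63));
	assert(imsic_nr_ids_valid(255));
	assert(!imsic_nr_ids_valid(62));
	assert(!imsic_nr_ids_valid(64));
	assert(!imsic_nr_ids_valid(2048));
}
```

Under these assumed constants the mask test accepts only counts whose low six bits are all ones, so 63, 127 and 255 pass while 64 and 100 do not.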

Björn

2024-02-20 13:01:39

by Anup Patel

Subject: Re: [PATCH v13 06/13] irqchip: Add RISC-V incoming MSI controller early driver

On Tue, Feb 20, 2024 at 5:22 PM Björn Töpel <bjorn@kernel.org> wrote:
>
> Anup,
>
> This version is so much easier to follow! Thanks a lot for the
> cleanups/design changes.
>
> A bunch of nits, and a major one, below.

Sure, I will address the nits in the next revision.

>
> Anup Patel <[email protected]> writes:
>
> > The RISC-V advanced interrupt architecture (AIA) specification
> > defines a new MSI controller called incoming message signalled
> > interrupt controller (IMSIC) which manages MSI on per-HART (or
> > per-CPU) basis. It also supports IPIs as software injected MSIs.
> > (For more details refer https://github.com/riscv/riscv-aia)
> >
> > Let us add an early irqchip driver for RISC-V IMSIC which sets
> > up the IMSIC state and provide IPIs.
> >
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > drivers/irqchip/Kconfig | 7 +
> > drivers/irqchip/Makefile | 1 +
> > drivers/irqchip/irq-riscv-imsic-early.c | 213 ++++++
> > drivers/irqchip/irq-riscv-imsic-state.c | 906 ++++++++++++++++++++++++
> > drivers/irqchip/irq-riscv-imsic-state.h | 98 +++
> > include/linux/irqchip/riscv-imsic.h | 87 +++
> > 6 files changed, 1312 insertions(+)
> > create mode 100644 drivers/irqchip/irq-riscv-imsic-early.c
> > create mode 100644 drivers/irqchip/irq-riscv-imsic-state.c
> > create mode 100644 drivers/irqchip/irq-riscv-imsic-state.h
> > create mode 100644 include/linux/irqchip/riscv-imsic.h
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index f7149d0f3d45..85f86e31c996 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -546,6 +546,13 @@ config SIFIVE_PLIC
> > select IRQ_DOMAIN_HIERARCHY
> > select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> >
> > +config RISCV_IMSIC
> > + bool
> > + depends on RISCV
> > + select IRQ_DOMAIN_HIERARCHY
> > + select GENERIC_IRQ_MATRIX_ALLOCATOR
> > + select GENERIC_MSI_IRQ
> > +
> > config EXYNOS_IRQ_COMBINER
> > bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> > depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index ffd945fe71aa..d714724387ce 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> > obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> > obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> > obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> > +obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o
> > obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> > obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> > obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
> > diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
> > new file mode 100644
> > index 000000000000..32fe428b1c19
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-imsic-early.c
> > @@ -0,0 +1,213 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqchip/chained_irq.h>
> > +#include <linux/module.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/smp.h>
> > +
> > +#include "irq-riscv-imsic-state.h"
> > +
> > +static int imsic_parent_irq;
> > +
> > +#ifdef CONFIG_SMP
> > +static void imsic_ipi_send(unsigned int cpu)
> > +{
> > + struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu);
> > +
> > + writel_relaxed(IMSIC_IPI_ID, local->msi_va);
> > +}
> > +
> > +static void imsic_ipi_starting_cpu(void)
> > +{
> > + /* Enable IPIs for current CPU. */
> > + __imsic_id_set_enable(IMSIC_IPI_ID);
> > +}
> > +
> > +static void imsic_ipi_dying_cpu(void)
> > +{
> > + /* Disable IPIs for current CPU. */
> > + __imsic_id_clear_enable(IMSIC_IPI_ID);
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(void)
> > +{
> > + int virq;
> > +
> > + /* Create IMSIC IPI multiplexing */
> > + virq = ipi_mux_create(IMSIC_NR_IPI, imsic_ipi_send);
> > + if (virq <= 0)
> > + return (virq < 0) ? virq : -ENOMEM;
>
> Nit: No need for the parentheses; they only add clutter.
>
> > +
> > + /* Set vIRQ range */
> > + riscv_ipi_set_virq_range(virq, IMSIC_NR_IPI, true);
> > +
> > + /* Announce that IMSIC is providing IPIs */
> > + pr_info("%pfwP: providing IPIs using interrupt %d\n", imsic->fwnode, IMSIC_IPI_ID);
> > +
> > + return 0;
> > +}
> > +#else
> > +static void imsic_ipi_starting_cpu(void)
> > +{
> > +}
> > +
> > +static void imsic_ipi_dying_cpu(void)
> > +{
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(void)
> > +{
> > + return 0;
> > +}
> > +#endif
> > +
> > +/*
> > + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> > + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> > + * Linux interrupt number and let Linux IRQ subsystem handle it.
> > + */
> > +static void imsic_handle_irq(struct irq_desc *desc)
> > +{
> > + struct irq_chip *chip = irq_desc_get_chip(desc);
> > + int err, cpu = smp_processor_id();
> > + struct imsic_vector *vec;
> > + unsigned long local_id;
> > +
> > + chained_irq_enter(chip, desc);
> > +
> > + while ((local_id = csr_swap(CSR_TOPEI, 0))) {
> > + local_id = local_id >> TOPEI_ID_SHIFT;
>
> Nit: Wdyt about moving shift into the loop predicate, or using >>=?
>
> > +
> > + if (local_id == IMSIC_IPI_ID) {
> > +#ifdef CONFIG_SMP
> > + ipi_mux_process();
> > +#endif
>
> Is IMSIC_IPI_ID a thing on !IS_ENABLED(CONFIG_SMP)?
>
> > + continue;
> > + }
> > +
> > + if (unlikely(!imsic->base_domain))
> > + continue;
> > +
> > + vec = imsic_vector_from_local_id(cpu, local_id);
> > + if (!vec) {
> > + pr_warn_ratelimited("vector not found for local ID 0x%lx\n", local_id);
> > + continue;
> > + }
> > +
> > + err = generic_handle_domain_irq(imsic->base_domain,
> > + vec->hwirq);
>
> Nit: 100 chars
>
> > + if (unlikely(err))
> > + pr_warn_ratelimited("hwirq 0x%x mapping not found\n", vec->hwirq);
> > + }
> > +
> > + chained_irq_exit(chip, desc);
> > +}
> > +
> > +static int imsic_starting_cpu(unsigned int cpu)
> > +{
> > + /* Mark per-CPU IMSIC state as online */
> > + imsic_state_online();
> > +
> > + /* Enable per-CPU parent interrupt */
> > + enable_percpu_irq(imsic_parent_irq, irq_get_trigger_type(imsic_parent_irq));
> > +
> > + /* Setup IPIs */
> > + imsic_ipi_starting_cpu();
> > +
> > + /*
> > + * Interrupts identities might have been enabled/disabled while
> > + * this CPU was not running so sync-up local enable/disable state.
> > + */
> > + imsic_local_sync_all();
> > +
> > + /* Enable local interrupt delivery */
> > + imsic_local_delivery(true);
> > +
> > + return 0;
> > +}
> > +
> > +static int imsic_dying_cpu(unsigned int cpu)
> > +{
> > + /* Cleanup IPIs */
> > + imsic_ipi_dying_cpu();
> > +
> > + /* Mark per-CPU IMSIC state as offline */
> > + imsic_state_offline();
> > +
> > + return 0;
> > +}
> > +
> > +static int __init imsic_early_probe(struct fwnode_handle *fwnode)
> > +{
> > + struct irq_domain *domain;
> > + int rc;
> > +
> > + /* Find parent domain and register chained handler */
> > + domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(), DOMAIN_BUS_ANY);
> > + if (!domain) {
> > + pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
> > + return -ENOENT;
> > + }
> > + imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > + if (!imsic_parent_irq) {
> > + pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
> > + return -ENOENT;
> > + }
> > +
> > + /* Initialize IPI domain */
> > + rc = imsic_ipi_domain_init();
> > + if (rc) {
> > + pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
> > + return rc;
> > + }
> > +
> > + /* Setup chained handler to the parent domain interrupt */
> > + irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
> > +
> > + /*
> > + * Setup cpuhp state (must be done after setting imsic_parent_irq)
> > + *
> > + * Don't disable per-CPU IMSIC file when CPU goes offline
> > + * because this affects IPI and the masking/unmasking of
> > + * virtual IPIs is done via generic IPI-Mux
> > + */
> > + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "irqchip/riscv/imsic:starting",
> > + imsic_starting_cpu, imsic_dying_cpu);
> > +
> > + return 0;
> > +}
> > +
> > +static int __init imsic_early_dt_init(struct device_node *node,
> > + struct device_node *parent)
> > +{
> > + struct fwnode_handle *fwnode = &node->fwnode;
> > + int rc;
> > +
> > + /* Setup IMSIC state */
> > + rc = imsic_setup_state(fwnode);
> > + if (rc) {
> > + pr_err("%pfwP: failed to setup state (error %d)\n",
> > + fwnode, rc);
>
> Nit: 100 chars
>
> > + return rc;
> > + }
> > +
> > + /* Do early setup of IPIs */
> > + rc = imsic_early_probe(fwnode);
> > + if (rc)
> > + return rc;
> > +
> > + /* Ensure that OF platform device gets probed */
> > + of_node_clear_flag(node, OF_POPULATED);
> > + return 0;
> > +}
> > +IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_early_dt_init);
> > diff --git a/drivers/irqchip/irq-riscv-imsic-state.c b/drivers/irqchip/irq-riscv-imsic-state.c
> > new file mode 100644
> > index 000000000000..4f347486ec7c
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-imsic-state.c
> > @@ -0,0 +1,906 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > +#include <linux/cpu.h>
> > +#include <linux/bitmap.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/irq.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of_irq.h>
> > +#include <linux/seq_file.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/smp.h>
> > +#include <asm/hwcap.h>
> > +
> > +#include "irq-riscv-imsic-state.h"
> > +
> > +#define IMSIC_DISABLE_EIDELIVERY 0
> > +#define IMSIC_ENABLE_EIDELIVERY 1
> > +#define IMSIC_DISABLE_EITHRESHOLD 1
> > +#define IMSIC_ENABLE_EITHRESHOLD 0
> > +
> > +static inline void imsic_csr_write(unsigned long reg, unsigned long val)
> > +{
> > + csr_write(CSR_ISELECT, reg);
> > + csr_write(CSR_IREG, val);
> > +}
> > +
> > +static inline unsigned long imsic_csr_read(unsigned long reg)
> > +{
> > + csr_write(CSR_ISELECT, reg);
> > + return csr_read(CSR_IREG);
> > +}
> > +
> > +static inline unsigned long imsic_csr_read_clear(unsigned long reg, unsigned long val)
> > +{
> > + csr_write(CSR_ISELECT, reg);
> > + return csr_read_clear(CSR_IREG, val);
> > +}
> > +
> > +static inline void imsic_csr_set(unsigned long reg, unsigned long val)
> > +{
> > + csr_write(CSR_ISELECT, reg);
> > + csr_set(CSR_IREG, val);
> > +}
> > +
> > +static inline void imsic_csr_clear(unsigned long reg, unsigned long val)
> > +{
> > + csr_write(CSR_ISELECT, reg);
> > + csr_clear(CSR_IREG, val);
> > +}
> > +
> > +struct imsic_priv *imsic;
> > +
> > +const struct imsic_global_config *imsic_get_global_config(void)
> > +{
> > + return imsic ? &imsic->global : NULL;
> > +}
> > +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> > +
> > +static bool __imsic_eix_read_clear(unsigned long id, bool pend)
> > +{
> > + unsigned long isel, imask;
> > +
> > + isel = id / BITS_PER_LONG;
> > + isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> > + isel += pend ? IMSIC_EIP0 : IMSIC_EIE0;
> > + imask = BIT(id & (__riscv_xlen - 1));
> > +
> > + return (imsic_csr_read_clear(isel, imask) & imask) ? true : false;
>
> Nit: use return !!(imsic_csr_read_clear(isel, imask) & imask)
>
> > +}
> > +
> > +static inline bool __imsic_id_read_clear_enabled(unsigned long id)
> > +{
> > + return __imsic_eix_read_clear(id, false);
> > +}
> > +
> > +static inline bool __imsic_id_read_clear_pending(unsigned long id)
> > +{
> > + return __imsic_eix_read_clear(id, true);
> > +}
> > +
> > +void __imsic_eix_update(unsigned long base_id, unsigned long num_id, bool pend, bool val)
> > +{
> > + unsigned long id = base_id, last_id = base_id + num_id;
> > + unsigned long i, isel, ireg;
> > +
> > + while (id < last_id) {
> > + isel = id / BITS_PER_LONG;
> > + isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> > + isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
>
> Nit: Redundant parentheses.
>
> > +
> > + /*
> > + * Prepare the ID mask to be programmed in the
> > + * IMSIC EIEx and EIPx registers. These registers
> > + * are XLEN-wide and we must not touch IDs which
> > + * are < base_id and >= (base_id + num_id).
> > + */
> > + ireg = 0;
> > + for (i = id & (__riscv_xlen - 1); (id < last_id) && (i < __riscv_xlen); i++) {
>
> Nit: Redundant parentheses "(id < last_id) && (i < __riscv_xlen)", which
> is also inconsistent with other usage in this changeset.
>
> > + ireg |= BIT(i);
> > + id++;
> > + }
> > +
> > + /*
> > + * The IMSIC EIEx and EIPx registers are indirectly
> > + * accessed via using ISELECT and IREG CSRs so we
> > + * need to access these CSRs without getting preempted.
> > + *
> > + * All existing users of this function call this
> > + * function with local IRQs disabled so we don't
> > + * need to do anything special here.
> > + */
> > + if (val)
> > + imsic_csr_set(isel, ireg);
> > + else
> > + imsic_csr_clear(isel, ireg);
> > + }
> > +}
> > +
> > +/* MUST be called with lpriv->lock held */
> > +static void __imsic_local_sync(struct imsic_local_priv *lpriv)
> > +{
> > + struct imsic_local_config *mlocal;
> > + struct imsic_vector *vec, *mvec;
> > + int i;
> > +
> > + /* This pairs with the barrier in __imsic_remote_sync(). */
> > + smp_mb();
> > +
> > + for_each_set_bit(i, lpriv->dirty_bitmap, imsic->global.nr_ids + 1) {
> > + if (!i || i == IMSIC_IPI_ID)
> > + goto skip;
> > + vec = &lpriv->vectors[i];
> > +
> > + if (vec->enable)
> > + __imsic_id_set_enable(i);
> > + else
> > + __imsic_id_clear_enable(i);
> > +
> > + /*
> > + * If the ID was being moved to a new ID on some other CPU
> > + * then we can get a MSI during the movement so check the
> > + * ID pending bit and re-trigger the new ID on other CPU
> > + * using MMIO write.
> > + */
> > + mvec = vec->move;
> > + vec->move = NULL;
> > + if (mvec && mvec != vec) {
> > + if (__imsic_id_read_clear_pending(i)) {
> > + mlocal = per_cpu_ptr(imsic->global.local, mvec->cpu);
> > + writel_relaxed(mvec->local_id, mlocal->msi_va);
> > + }
> > +
> > + imsic_vector_free(&lpriv->vectors[i]);
> > + }
> > +
> > +skip:
> > + bitmap_clear(lpriv->dirty_bitmap, i, 1);
> > + }
> > +}
> > +
> > +void imsic_local_sync_all(void)
> > +{
> > + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&lpriv->lock, flags);
> > + bitmap_fill(lpriv->dirty_bitmap, imsic->global.nr_ids + 1);
> > + __imsic_local_sync(lpriv);
> > + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> > +}
> > +
> > +void imsic_local_delivery(bool enable)
> > +{
> > + if (enable) {
> > + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> > + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> > + return;
> > + }
> > +
> > + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> > + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static void imsic_local_timer_callback(struct timer_list *timer)
> > +{
> > + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&lpriv->lock, flags);
> > + __imsic_local_sync(lpriv);
> > + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> > +}
> > +
> > +/* MUST be called with lpriv->lock held */
> > +static void __imsic_remote_sync(struct imsic_local_priv *lpriv, unsigned int cpu)
> > +{
> > + /*
> > + * Ensure that changes to vector enable, vector move and
> > + * dirty bitmap are visible to the target CPU.
> > + *
> > + * This pairs with the barrier in __imsic_local_sync().
> > + */
> > + smp_mb();
> > +
> > + /*
> > + * We schedule a timer on the target CPU if the target CPU is not
> > + * same as the current CPU. An offline CPU will unconditionally
> > + * synchronize IDs through imsic_starting_cpu() when the
> > + * CPU is brought up.
> > + */
> > + if (cpu_online(cpu)) {
> > + if (cpu != smp_processor_id()) {
> > + if (!timer_pending(&lpriv->timer)) {
> > + lpriv->timer.expires = jiffies + 1;
> > + add_timer_on(&lpriv->timer, cpu);
> > + }
> > + } else {
> > + __imsic_local_sync(lpriv);
> > + }
>
> Nit: Early exit/return vs else-clause for readability
>
>
> > + }
> > +}
> > +#else
> > +/* MUST be called with lpriv->lock held */
> > +static void __imsic_remote_sync(struct imsic_local_priv *lpriv, unsigned int cpu)
> > +{
> > + __imsic_local_sync(lpriv);
> > +}
> > +#endif
> > +
> > +void imsic_vector_mask(struct imsic_vector *vec)
> > +{
> > + struct imsic_local_priv *lpriv;
> > +
> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> > + return;
> > +
> > + /*
> > + * This function is called through Linux irq subsystem with
> > + * irqs disabled so no need to save/restore irq flags.
> > + */
> > +
> > + raw_spin_lock(&lpriv->lock);
> > +
> > + vec->enable = false;
> > + bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
> > + __imsic_remote_sync(lpriv, vec->cpu);
> > +
> > + raw_spin_unlock(&lpriv->lock);
> > +}
>
> Really nice that you're using a timer for the vector affinity change,
> and got rid of the special/weird IMSIC/sync IPI. Can you really use a
> timer for mask/unmask? That makes the mask/unmask operation
> asynchronous!
>
> That was what I was trying to get across with this comment:
> https://lore.kernel.org/linux-riscv/[email protected]/
>
> Also, using the smp_* IPI functions, you can pass arguments, so you
> don't need the dirty_bitmap tracking the changes.

The mask/unmask operations are called with irqs disabled. If CPU X
sends a synchronous IPI to CPU Y from a mask/unmask operation, then
CPU X cannot receive IPIs from other CPUs while it waits for that IPI
to complete, which can lead to crashes and stalls.

In general, we should not do any blocking or busy-wait work in the
mask/unmask operations of an irqchip driver.

The AIA IMSIC spec allows setting the ID pending bit using an MSI
write irrespective of whether the ID is enabled, but the interrupt
will be taken only after the ID is enabled. In other words, no
interrupt is lost when mask/unmask is delayed by an async IPI or a
lazy timer.
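
As a rough illustration of the last point, here is a small userspace
model (not driver code) of one interrupt identity in an IMSIC interrupt
file. All demo_* names are made up for this sketch. It assumes only what
is stated above: an MSI write latches the pending bit regardless of the
enable bit, and the interrupt is taken once the ID is enabled.

```c
#include <stdbool.h>

/* Toy model of a single interrupt identity; names are hypothetical. */
struct demo_imsic_id {
	bool pending;	/* latched by an incoming MSI write */
	bool enabled;	/* updated lazily by the owning CPU */
};

/* An MSI write sets the pending bit whether or not the ID is enabled. */
static void demo_msi_write(struct demo_imsic_id *id)
{
	id->pending = true;
}

/* Deferred mask/unmask, as applied later on the target CPU. */
static void demo_sync_enable(struct demo_imsic_id *id, bool enable)
{
	id->enabled = enable;
}

/*
 * The interrupt is taken only when it is both pending and enabled.
 * Taking it clears the pending bit. Returns true if it was taken.
 */
static bool demo_take_interrupt(struct demo_imsic_id *id)
{
	if (!id->pending || !id->enabled)
		return false;

	id->pending = false;
	return true;
}
```

In this model an MSI that arrives between the unmask request and the
deferred sync stays pending and is delivered once the enable bit is
finally written.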

>
> > +
> > +void imsic_vector_unmask(struct imsic_vector *vec)
> > +{
> > + struct imsic_local_priv *lpriv;
> > +
> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> > + return;
> > +
> > + /*
> > + * This function is called through Linux irq subsystem with
> > + * irqs disabled so no need to save/restore irq flags.
> > + */
> > +
> > + raw_spin_lock(&lpriv->lock);
> > +
> > + vec->enable = true;
> > + bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
> > + __imsic_remote_sync(lpriv, vec->cpu);
> > +
> > + raw_spin_unlock(&lpriv->lock);
> > +}
> > +
> > +
> > +bool imsic_vector_isenabled(struct imsic_vector *vec)
> > +{
> > + struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> > + unsigned long flags;
> > + bool ret;
> > +
> > + raw_spin_lock_irqsave(&lpriv->lock, flags);
> > + ret = vec->enable;
> > + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> > +
> > + return ret;
> > +}
> > +
> > +struct imsic_vector *imsic_vector_get_move(struct imsic_vector *vec)
> > +{
> > + struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> > + struct imsic_vector *ret;
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&lpriv->lock, flags);
> > + ret = vec->move;
> > + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> > +
> > + return ret;
> > +}
> > +
> > +static bool imsic_vector_move_update(struct imsic_local_priv *lpriv, struct imsic_vector *vec,
> > + bool new_enable, struct imsic_vector *new_move)
> > +{
> > + unsigned long flags;
> > + bool enabled;
> > +
> > + raw_spin_lock_irqsave(&lpriv->lock, flags);
> > +
> > + /* Update enable and move details */
> > + enabled = vec->enable;
> > + vec->enable = new_enable;
> > + vec->move = new_move;
> > +
> > + /* Mark the vector as dirty and synchronize */
> > + bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
> > + __imsic_remote_sync(lpriv, vec->cpu);
> > +
> > + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> > +
> > + return enabled;
> > +}
> > +
> > +void imsic_vector_move(struct imsic_vector *old_vec, struct imsic_vector *new_vec)
> > +{
> > + struct imsic_local_priv *old_lpriv, *new_lpriv;
> > + bool enabled;
> > +
> > + if (WARN_ON(old_vec->cpu == new_vec->cpu))
> > + return;
> > +
> > + old_lpriv = per_cpu_ptr(imsic->lpriv, old_vec->cpu);
> > + if (WARN_ON(&old_lpriv->vectors[old_vec->local_id] != old_vec))
> > + return;
> > +
> > + new_lpriv = per_cpu_ptr(imsic->lpriv, new_vec->cpu);
> > + if (WARN_ON(&new_lpriv->vectors[new_vec->local_id] != new_vec))
> > + return;
> > +
> > + /*
> > + * Move and re-trigger the new vector based on the pending
> > + * state of the old vector because we might get a device
> > + * interrupt on the old vector while device was being moved
> > + * to the new vector.
> > + */
> > + enabled = imsic_vector_move_update(old_lpriv, old_vec, false, new_vec);
> > + imsic_vector_move_update(new_lpriv, new_vec, enabled, new_vec);
> > +}
> > +
> > +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
> > +void imsic_vector_debug_show(struct seq_file *m, struct imsic_vector *vec, int ind)
> > +{
> > + struct imsic_local_priv *lpriv;
> > + struct imsic_vector *mvec;
> > + bool is_enabled;
> > +
> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> > + return;
> > +
> > + is_enabled = imsic_vector_isenabled(vec);
> > + mvec = imsic_vector_get_move(vec);
> > +
> > + seq_printf(m, "%*starget_cpu : %5u\n", ind, "", vec->cpu);
> > + seq_printf(m, "%*starget_local_id : %5u\n", ind, "", vec->local_id);
> > + seq_printf(m, "%*sis_reserved : %5u\n", ind, "",
> > + (vec->local_id <= IMSIC_IPI_ID) ? 1 : 0);
> > + seq_printf(m, "%*sis_enabled : %5u\n", ind, "", (is_enabled) ? 1 : 0);
> > + seq_printf(m, "%*sis_move_pending : %5u\n", ind, "", (mvec) ? 1 : 0);
>
> Nit: Redundant parenthesis.
>
> > + if (mvec) {
> > + seq_printf(m, "%*smove_cpu : %5u\n", ind, "", mvec->cpu);
> > + seq_printf(m, "%*smove_local_id : %5u\n", ind, "", mvec->local_id);
> > + }
> > +}
> > +
> > +void imsic_vector_debug_show_summary(struct seq_file *m, int ind)
> > +{
> > + irq_matrix_debug_show(m, imsic->matrix, ind);
> > +}
> > +#endif
> > +
> > +struct imsic_vector *imsic_vector_from_local_id(unsigned int cpu, unsigned int local_id)
> > +{
> > + struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> > +
> > + if (!lpriv || imsic->global.nr_ids < local_id)
> > + return NULL;
> > +
> > + return &lpriv->vectors[local_id];
> > +}
> > +
> > +struct imsic_vector *imsic_vector_alloc(unsigned int hwirq, const struct cpumask *mask)
> > +{
> > + struct imsic_vector *vec = NULL;
> > + struct imsic_local_priv *lpriv;
> > + unsigned long flags;
> > + unsigned int cpu;
> > + int local_id;
> > +
> > + raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
> > + local_id = irq_matrix_alloc(imsic->matrix, mask, false, &cpu);
> > + raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
> > + if (local_id < 0)
> > + return NULL;
> > +
> > + lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> > + vec = &lpriv->vectors[local_id];
> > + vec->hwirq = hwirq;
> > + vec->enable = false;
> > + vec->move = NULL;
> > +
> > + return vec;
> > +}
> > +
> > +void imsic_vector_free(struct imsic_vector *vec)
> > +{
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
> > + vec->hwirq = UINT_MAX;
> > + irq_matrix_free(imsic->matrix, vec->cpu, vec->local_id, false);
> > + raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
> > +}
> > +
> > +static void __init imsic_local_cleanup(void)
> > +{
> > + int cpu;
> > + struct imsic_local_priv *lpriv;
> > +
> > + for_each_possible_cpu(cpu) {
> > + lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> > +
> > + bitmap_free(lpriv->dirty_bitmap);
> > + kfree(lpriv->vectors);
> > + }
> > +
> > + free_percpu(imsic->lpriv);
> > +}
> > +
> > +static int __init imsic_local_init(void)
> > +{
> > + struct imsic_global_config *global = &imsic->global;
> > + struct imsic_local_priv *lpriv;
> > + struct imsic_vector *vec;
> > + int cpu, i;
> > +
> > + /* Allocate per-CPU private state */
> > + imsic->lpriv = alloc_percpu(typeof(*(imsic->lpriv)));
> > + if (!imsic->lpriv)
> > + return -ENOMEM;
> > +
> > + /* Setup per-CPU private state */
> > + for_each_possible_cpu(cpu) {
> > + lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> > +
> > + raw_spin_lock_init(&lpriv->lock);
> > +
> > + /* Allocate dirty bitmap */
> > + lpriv->dirty_bitmap = bitmap_zalloc(global->nr_ids + 1, GFP_KERNEL);
> > + if (!lpriv->dirty_bitmap)
> > + goto fail_local_cleanup;
> > +
> > +#ifdef CONFIG_SMP
> > + /* Setup lazy timer for synchronization */
> > + timer_setup(&lpriv->timer, imsic_local_timer_callback, TIMER_PINNED);
> > +#endif
> > +
> > + /* Allocate vector array */
> > + lpriv->vectors = kcalloc(global->nr_ids + 1, sizeof(*lpriv->vectors),
> > + GFP_KERNEL);
> > + if (!lpriv->vectors)
> > + goto fail_local_cleanup;
> > +
> > + /* Setup vector array */
> > + for (i = 0; i <= global->nr_ids; i++) {
> > + vec = &lpriv->vectors[i];
> > + vec->cpu = cpu;
> > + vec->local_id = i;
> > + vec->hwirq = UINT_MAX;
> > + }
> > + }
> > +
> > + return 0;
> > +
> > +fail_local_cleanup:
> > + imsic_local_cleanup();
> > + return -ENOMEM;
> > +}
> > +
> > +void imsic_state_online(void)
> > +{
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
> > + irq_matrix_online(imsic->matrix);
> > + raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
> > +}
> > +
> > +void imsic_state_offline(void)
> > +{
> > +#ifdef CONFIG_SMP
> > + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> > +#endif
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
> > + irq_matrix_offline(imsic->matrix);
> > + raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
> > +
> > +#ifdef CONFIG_SMP
> > + raw_spin_lock_irqsave(&lpriv->lock, flags);
> > + WARN_ON_ONCE(try_to_del_timer_sync(&lpriv->timer) < 0);
> > + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> > +#endif
> > +}
> > +
> > +static int __init imsic_matrix_init(void)
> > +{
> > + struct imsic_global_config *global = &imsic->global;
> > +
> > + raw_spin_lock_init(&imsic->matrix_lock);
> > + imsic->matrix = irq_alloc_matrix(global->nr_ids + 1,
> > + 0, global->nr_ids + 1);
> > + if (!imsic->matrix)
> > + return -ENOMEM;
> > +
> > + /* Reserve ID#0 because it is special and never implemented */
> > + irq_matrix_assign_system(imsic->matrix, 0, false);
> > +
> > + /* Reserve IPI ID because it is special and used internally */
> > + irq_matrix_assign_system(imsic->matrix, IMSIC_IPI_ID, false);
> > +
> > + return 0;
> > +}
> > +
> > +static int __init imsic_get_parent_hartid(struct fwnode_handle *fwnode,
> > + u32 index, unsigned long *hartid)
> > +{
> > + struct of_phandle_args parent;
> > + int rc;
> > +
> > + /*
> > + * Currently, only OF fwnode is supported so extend this
> > + * function for ACPI support.
> > + */
> > + if (!is_of_node(fwnode))
> > + return -EINVAL;
> > +
> > + rc = of_irq_parse_one(to_of_node(fwnode), index, &parent);
> > + if (rc)
> > + return rc;
> > +
> > + /*
> > + * Skip interrupts other than external interrupts for
> > + * current privilege level.
> > + */
> > + if (parent.args[0] != RV_IRQ_EXT)
> > + return -EINVAL;
> > +
> > + return riscv_of_parent_hartid(parent.np, hartid);
> > +}
> > +
> > +static int __init imsic_get_mmio_resource(struct fwnode_handle *fwnode,
> > + u32 index, struct resource *res)
> > +{
> > + /*
> > + * Currently, only OF fwnode is supported so extend this
> > + * function for ACPI support.
> > + */
> > + if (!is_of_node(fwnode))
> > + return -EINVAL;
> > +
> > + return of_address_to_resource(to_of_node(fwnode), index, res);
> > +}
> > +
> > +static int __init imsic_parse_fwnode(struct fwnode_handle *fwnode,
> > + struct imsic_global_config *global,
> > + u32 *nr_parent_irqs,
> > + u32 *nr_mmios)
> > +{
> > + unsigned long hartid;
> > + struct resource res;
> > + int rc;
> > + u32 i;
> > +
> > + /*
> > + * Currently, only OF fwnode is supported so extend this
> > + * function for ACPI support.
> > + */
> > + if (!is_of_node(fwnode))
> > + return -EINVAL;
> > +
> > + *nr_parent_irqs = 0;
> > + *nr_mmios = 0;
> > +
> > + /* Find number of parent interrupts */
> > + while (!imsic_get_parent_hartid(fwnode, *nr_parent_irqs, &hartid))
> > + (*nr_parent_irqs)++;
> > + if (!(*nr_parent_irqs)) {
>
> Nit: Redundant parenthesis
>
> > + pr_err("%pfwP: no parent irqs available\n", fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Find number of guest index bits in MSI address */
> > + rc = of_property_read_u32(to_of_node(fwnode), "riscv,guest-index-bits",
> > + &global->guest_index_bits);
> > + if (rc)
> > + global->guest_index_bits = 0;
> > +
> > + /* Find number of HART index bits */
> > + rc = of_property_read_u32(to_of_node(fwnode), "riscv,hart-index-bits",
> > + &global->hart_index_bits);
> > + if (rc) {
> > + /* Assume default value */
> > + global->hart_index_bits = __fls(*nr_parent_irqs);
> > + if (BIT(global->hart_index_bits) < *nr_parent_irqs)
> > + global->hart_index_bits++;
> > + }
> > +
> > + /* Find number of group index bits */
> > + rc = of_property_read_u32(to_of_node(fwnode), "riscv,group-index-bits",
> > + &global->group_index_bits);
> > + if (rc)
> > + global->group_index_bits = 0;
> > +
> > + /*
> > + * Find first bit position of group index.
> > + * If not specified assumed the default APLIC-IMSIC configuration.
> > + */
> > + rc = of_property_read_u32(to_of_node(fwnode), "riscv,group-index-shift",
> > + &global->group_index_shift);
> > + if (rc)
> > + global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
> > +
> > + /* Find number of interrupt identities */
> > + rc = of_property_read_u32(to_of_node(fwnode), "riscv,num-ids",
> > + &global->nr_ids);
> > + if (rc) {
> > + pr_err("%pfwP: number of interrupt identities not found\n",
> > + fwnode);
> > + return rc;
> > + }
> > +
> > + /* Find number of guest interrupt identities */
> > + rc = of_property_read_u32(to_of_node(fwnode), "riscv,num-guest-ids",
> > + &global->nr_guest_ids);
> > + if (rc)
> > + global->nr_guest_ids = global->nr_ids;
> > +
> > + /* Sanity check guest index bits */
> > + i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> > + if (i < global->guest_index_bits) {
> > + pr_err("%pfwP: guest index bits too big\n", fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Sanity check HART index bits */
> > + i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT - global->guest_index_bits;
> > + if (i < global->hart_index_bits) {
> > + pr_err("%pfwP: HART index bits too big\n", fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Sanity check group index bits */
> > + i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
> > + global->guest_index_bits - global->hart_index_bits;
> > + if (i < global->group_index_bits) {
> > + pr_err("%pfwP: group index bits too big\n", fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Sanity check group index shift */
> > + i = global->group_index_bits + global->group_index_shift - 1;
> > + if (i >= BITS_PER_LONG) {
> > + pr_err("%pfwP: group index shift too big\n", fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Sanity check number of interrupt identities */
> > + if ((global->nr_ids < IMSIC_MIN_ID) ||
> > + (global->nr_ids >= IMSIC_MAX_ID) ||
> > + ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> > + pr_err("%pfwP: invalid number of interrupt identities\n",
> > + fwnode);
>
> Nit: 100 chars
>
> > + return -EINVAL;
> > + }
> > +
> > + /* Sanity check number of guest interrupt identities */
> > + if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
> > + (global->nr_guest_ids >= IMSIC_MAX_ID) ||
> > + ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
> > + pr_err("%pfwP: invalid number of guest interrupt identities\n",
> > + fwnode);
>
> Nit: 100 chars
>
> > + return -EINVAL;
> > + }
> > +
> > + /* Compute base address */
> > + rc = imsic_get_mmio_resource(fwnode, 0, &res);
> > + if (rc) {
> > + pr_err("%pfwP: first MMIO resource not found\n", fwnode);
> > + return -EINVAL;
> > + }
> > + global->base_addr = res.start;
> > + global->base_addr &= ~(BIT(global->guest_index_bits +
> > + global->hart_index_bits +
> > + IMSIC_MMIO_PAGE_SHIFT) - 1);
> > + global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> > + global->group_index_shift);
> > +
> > + /* Find number of MMIO register sets */
> > + while (!imsic_get_mmio_resource(fwnode, *nr_mmios, &res))
> > + (*nr_mmios)++;
> > +
> > + return 0;
> > +}
> > +
> > +int __init imsic_setup_state(struct fwnode_handle *fwnode)
> > +{
> > + u32 i, j, index, nr_parent_irqs, nr_mmios, nr_handlers = 0;
> > + struct imsic_global_config *global;
> > + struct imsic_local_config *local;
> > + void __iomem **mmios_va = NULL;
> > + struct resource *mmios = NULL;
> > + unsigned long reloff, hartid;
> > + phys_addr_t base_addr;
> > + int rc, cpu;
> > +
> > + /*
> > + * Only one IMSIC instance allowed in a platform for clean
> > + * implementation of SMP IRQ affinity and per-CPU IPIs.
> > + *
> > + * This means on a multi-socket (or multi-die) platform we
> > + * will have multiple MMIO regions for one IMSIC instance.
> > + */
> > + if (imsic) {
> > + pr_err("%pfwP: already initialized hence ignoring\n",
> > + fwnode);
>
> Nit: 100 chars
>
> > + return -EALREADY;
> > + }
> > +
> > + if (!riscv_isa_extension_available(NULL, SxAIA)) {
> > + pr_err("%pfwP: AIA support not available\n", fwnode);
> > + return -ENODEV;
> > + }
> > +
> > + imsic = kzalloc(sizeof(*imsic), GFP_KERNEL);
> > + if (!imsic)
> > + return -ENOMEM;
> > + imsic->fwnode = fwnode;
> > + global = &imsic->global;
> > +
> > + global->local = alloc_percpu(typeof(*(global->local)));
> > + if (!global->local) {
> > + rc = -ENOMEM;
> > + goto out_free_priv;
> > + }
> > +
> > + /* Parse IMSIC fwnode */
> > + rc = imsic_parse_fwnode(fwnode, global, &nr_parent_irqs, &nr_mmios);
> > + if (rc)
> > + goto out_free_local;
> > +
> > + /* Allocate MMIO resource array */
> > + mmios = kcalloc(nr_mmios, sizeof(*mmios), GFP_KERNEL);
> > + if (!mmios) {
> > + rc = -ENOMEM;
> > + goto out_free_local;
> > + }
> > +
> > + /* Allocate MMIO virtual address array */
> > + mmios_va = kcalloc(nr_mmios, sizeof(*mmios_va), GFP_KERNEL);
> > + if (!mmios_va) {
> > + rc = -ENOMEM;
> > + goto out_iounmap;
> > + }
> > +
> > + /* Parse and map MMIO register sets */
> > + for (i = 0; i < nr_mmios; i++) {
> > + rc = imsic_get_mmio_resource(fwnode, i, &mmios[i]);
> > + if (rc) {
> > + pr_err("%pfwP: unable to parse MMIO regset %d\n",
> > + fwnode, i);
>
> Nit: 100 chars
>
> > + goto out_iounmap;
> > + }
> > +
> > + base_addr = mmios[i].start;
> > + base_addr &= ~(BIT(global->guest_index_bits +
> > + global->hart_index_bits +
> > + IMSIC_MMIO_PAGE_SHIFT) - 1);
> > + base_addr &= ~((BIT(global->group_index_bits) - 1) <<
> > + global->group_index_shift);
> > + if (base_addr != global->base_addr) {
> > + rc = -EINVAL;
> > + pr_err("%pfwP: address mismatch for regset %d\n",
> > + fwnode, i);
>
> Nit: 100 chars... and all the places below where applicable.
>
>
> Björn

Regards,
Anup

2024-02-20 13:09:58

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v13 00/13] Linux RISC-V AIA Support

On Tue, Feb 20, 2024 at 5:22 PM Björn Töpel <bjorn@kernel.org> wrote:
>
> Anup Patel <[email protected]> writes:
>
> > The RISC-V AIA specification is ratified as per the RISC-V International
> > process. The latest ratified AIA specification can be found at:
> > https://github.com/riscv/riscv-aia/releases/download/1.0/riscv-interrupts-1.0.pdf
> >
> > At a high-level, the AIA specification adds three things:
> > 1) AIA CSRs
> > - Improved local interrupt support
> > 2) Incoming Message Signaled Interrupt Controller (IMSIC)
> > - Per-HART MSI controller
> > - Support MSI virtualization
> > - Support IPI along with virtualization
> > 3) Advanced Platform-Level Interrupt Controller (APLIC)
> > - Wired interrupt controller
> > - In MSI-mode, converts wired interrupt into MSIs (i.e. MSI generator)
> > - In Direct-mode, injects external interrupts directly into HARTs
> >
> > For an overview of the AIA specification, refer to the AIA virtualization
> > talk at KVM Forum 2022:
> > https://static.sched.com/hosted_files/kvmforum2022/a1/AIA_Virtualization_in_KVM_RISCV_final.pdf
> > https://www.youtube.com/watch?v=r071dL8Z0yo
> >
> > To test this series, use QEMU v7.2 (or higher) and OpenSBI v1.2 (or higher).
> >
> > This series depends upon per-device MSI domain patches merged by Thomas (tglx)
> > which are available in irq/msi branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> >
> > These patches can also be found in the riscv_aia_v13 branch at:
> > https://github.com/avpatel/linux.git
> >
> > Changes since v12:
> > - Rebased on Linux-6.8-rc5
> > - Dropped per-device MSI domain patches which are already merged by Thomas (tglx)
> > - Addressed nit comments from Thomas and Clement
> > - Added a new patch2 to fix lock dependency warning
> > - Replaced local sync IPI in the IMSIC driver with per-CPU timer
> > - Simplified locking in the IMSIC driver to avoid lock dependency issues
> > - Added a dirty bitmap in the IMSIC driver to optimize per-CPU local sync loop
>
> Thanks, Anup.
>
> I will take it for a spin, with Alex' v1 of the stop_machine()/ftrace
> IPI fix.
>
> The defconfig change (12/13) breaks a bunch of builds:
> https://patchwork.kernel.org/project/linux-riscv/list/?series=827706
>
> Download the logs here:
> https://github.com/linux-riscv/linux-riscv/suites/20917102160/logs?attempt=1
> and grep for '##[error]'

You need to pull in the 14 dependent patches that Thomas has merged
into his irq/msi branch at
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git.

Regards,
Anup

2024-02-20 13:16:11

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v13 06/13] irqchip: Add RISC-V incoming MSI controller early driver

On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> The RISC-V advanced interrupt architecture (AIA) specification
> defines a new MSI controller called incoming message signalled
> interrupt controller (IMSIC) which manages MSI on per-HART (or
> per-CPU) basis. It also supports IPIs as software injected MSIs.
> (For more details refer https://github.com/riscv/riscv-aia)
>
> Let us add an early irqchip driver for RISC-V IMSIC which sets
> up the IMSIC state and provide IPIs.

s/Let us add/Add/

> +#else
> +static void imsic_ipi_starting_cpu(void)
> +{
> +}
> +
> +static void imsic_ipi_dying_cpu(void)
> +{
> +}
> +
> +static int __init imsic_ipi_domain_init(void)
> +{
> + return 0;
> +}

Please condense this into

static void imsic_ipi_starting_cpu(void) { }
static void imsic_ipi_dying_cpu(void) { }
static int __init imsic_ipi_domain_init(void) { return 0; }

No point in wasting space for these stubs.

> + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> + * Linux interrupt number and let Linux IRQ subsystem handle it.
> + */
> +static void imsic_handle_irq(struct irq_desc *desc)
> +{
> + struct irq_chip *chip = irq_desc_get_chip(desc);
> + int err, cpu = smp_processor_id();
> + struct imsic_vector *vec;
> + unsigned long local_id;
> +
> + chained_irq_enter(chip, desc);
> +
> + while ((local_id = csr_swap(CSR_TOPEI, 0))) {
> + local_id = local_id >> TOPEI_ID_SHIFT;
> +
> + if (local_id == IMSIC_IPI_ID) {
> +#ifdef CONFIG_SMP

if (IS_ENABLED(CONFIG_SMP))

> + ipi_mux_process();
> +#endif
> + continue;
> + }
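
The IS_ENABLED() suggestion above can be sketched outside the kernel as
follows. This is a reduced imitation of the kernel macro, assuming only
the "defined as 1" case (the real one also handles modular options); the
DEMO_/demo_ names and the CONFIG_DEMO_* options are hypothetical.

```c
/*
 * Reduced stand-in for the kernel's IS_ENABLED(); the DEMO_ and demo_
 * names are hypothetical. Evaluates to 1 if the option is defined
 * as 1, and to 0 if it is not defined at all.
 */
#define DEMO_ARG_PLACEHOLDER_1			0,
#define demo_take_second_arg(ignored, val, ...)	val
#define demo_is_defined_2(arg1_or_junk)		demo_take_second_arg(arg1_or_junk 1, 0, 0)
#define demo_is_defined_1(val)			demo_is_defined_2(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_ENABLED(option)			demo_is_defined_1(option)

#define CONFIG_DEMO_SMP 1	/* CONFIG_DEMO_UP is deliberately left undefined */

static unsigned int demo_ipis_processed;

/* Must be declared in every configuration, unlike with #ifdef. */
static void demo_ipi_process(void)
{
	demo_ipis_processed++;
}

/* Returns 1 if the ID was the IPI ID, mirroring the 'continue' path. */
static int demo_handle_id(unsigned int id, unsigned int ipi_id)
{
	if (id == ipi_id) {
		/* Both branches are compiled; the dead one is optimized away. */
		if (DEMO_IS_ENABLED(CONFIG_DEMO_SMP))
			demo_ipi_process();
		return 1;
	}
	return 0;
}
```

The trade-off is that the callee needs a visible declaration even when
the option is off, but the compiler then checks both configurations on
every build.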

> +
> +/* MUST be called with lpriv->lock held */

Instead of a comment which cannot be enforced just have

lockdep_assert_held(&lpriv->lock);

right at the top of the function. That documents the requirement and
lets lockdep yell if not followed.
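
A userspace sketch of the same idea, assuming a toy lock that merely
records whether it is held. The demo_* names are hypothetical and
demo_assert_held() only stands in for lockdep_assert_held(); in the
kernel the real check is active only when lockdep is enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that only records whether it is held; not a real spinlock. */
struct demo_lock {
	bool held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Stand-in for lockdep_assert_held(): abort if the lock is not held. */
#define demo_assert_held(l)	assert((l)->held)

struct demo_local_priv {
	struct demo_lock lock;
	unsigned int nr_syncs;
};

/* The locking rule is checked at run time instead of stated in a comment. */
static void demo_local_sync(struct demo_local_priv *lpriv)
{
	demo_assert_held(&lpriv->lock);
	lpriv->nr_syncs++;
}
```

Calling demo_local_sync() without taking the lock first trips the
assertion, which is the behaviour a comment cannot provide.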

> +#ifdef CONFIG_SMP
> +static void imsic_local_timer_callback(struct timer_list *timer)
> +{
> + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&lpriv->lock, flags);
> + __imsic_local_sync(lpriv);
> + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> +}
> +
> +/* MUST be called with lpriv->lock held */

Ditto

> +static void __imsic_remote_sync(struct imsic_local_priv *lpriv, unsigned int cpu)

> +void imsic_vector_mask(struct imsic_vector *vec)
> +{
> + struct imsic_local_priv *lpriv;
> +
> + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> + return;

WARN_ON_ONCE(), no?

> +bool imsic_vector_isenabled(struct imsic_vector *vec)
> +{
> + struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + unsigned long flags;
> + bool ret;
> +
> + raw_spin_lock_irqsave(&lpriv->lock, flags);
> + ret = vec->enable;
> + raw_spin_unlock_irqrestore(&lpriv->lock, flags);

I'm not sure what you are trying to protect here. vec->enable can
obviously change right after the lock is dropped. So that's just a
snapshot, which is not any better than using

READ_ONCE(vec->enable);

and a corresponding WRITE_ONCE() at the update site, which obviously
needs serialization.
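
A minimal userspace sketch of that pattern, assuming GNU C for
__typeof__. DEMO_READ_ONCE()/DEMO_WRITE_ONCE() are simplified stand-ins
for the kernel macros, and the demo_* struct and functions are
hypothetical.

```c
#include <stdbool.h>

/* Simplified userspace stand-ins for the kernel macros (GNU C __typeof__). */
#define DEMO_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

struct demo_vector {
	bool enable;
};

/* Reader: a lockless snapshot; the value may change right afterwards. */
static bool demo_vector_isenabled(struct demo_vector *vec)
{
	return DEMO_READ_ONCE(vec->enable);
}

/* Writer: the caller is assumed to hold the lock that serializes updates. */
static void demo_vector_set_enable(struct demo_vector *vec, bool enable)
{
	DEMO_WRITE_ONCE(vec->enable, enable);
}
```

The reader gets the same guarantee as the locked version, a snapshot
that may already be stale, without taking the lock.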

> +static void __init imsic_local_cleanup(void)
> +{
> + int cpu;
> + struct imsic_local_priv *lpriv;

struct imsic_local_priv *lpriv;
int cpu;

Please.

> +void imsic_state_offline(void)
> +{
> +#ifdef CONFIG_SMP
> + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> +#endif

You can move that into the #ifdef below.

> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
> + irq_matrix_offline(imsic->matrix);
> + raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
> +
> +#ifdef CONFIG_SMP
> + raw_spin_lock_irqsave(&lpriv->lock, flags);
> + WARN_ON_ONCE(try_to_del_timer_sync(&lpriv->timer) < 0);
> + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> +#endif
> +}


Thanks,

tglx

2024-02-20 13:16:33

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v13 06/13] irqchip: Add RISC-V incoming MSI controller early driver

On Tue, Feb 20, 2024 at 5:23 PM Björn Töpel <bjorn@kernel.org> wrote:
>
> Anup Patel <[email protected]> writes:
>
> > The RISC-V advanced interrupt architecture (AIA) specification
> > defines a new MSI controller called incoming message signalled
> > interrupt controller (IMSIC) which manages MSI on per-HART (or
> > per-CPU) basis. It also supports IPIs as software injected MSIs.
> > (For more details refer https://github.com/riscv/riscv-aia)
> >
> > Let us add an early irqchip driver for RISC-V IMSIC which sets
> > up the IMSIC state and provide IPIs.
> >
> > Signed-off-by: Anup Patel <[email protected]>
>
> This patch has a couple of checkpatch issues:
>
> CHECK: Alignment should match open parenthesis
> CHECK: Please don't use multiple blank lines
> CHECK: Please use a blank line after function/struct/union/enum declarations
> CHECK: Unnecessary parentheses around 'global->nr_guest_ids < IMSIC_MIN_ID'
> CHECK: Unnecessary parentheses around 'global->nr_guest_ids >= IMSIC_MAX_ID'
> CHECK: Unnecessary parentheses around 'global->nr_ids < IMSIC_MIN_ID'
> CHECK: Unnecessary parentheses around 'global->nr_ids >= IMSIC_MAX_ID'
> CHECK: Unnecessary parentheses around global->local
> CHECK: Unnecessary parentheses around imsic->lpriv
> CHECK: extern prototypes should be avoided in .h files
>

Okay, I will address these in the next revision.

Regards,
Anup

2024-02-20 13:32:38

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v13 07/13] irqchip/riscv-imsic: Add device MSI domain support for platform devices

On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> +#ifdef CONFIG_SMP
> +static void imsic_msi_update_msg(struct irq_data *d, struct imsic_vector *vec)
> +{
> + struct msi_msg msg[2] = { [1] = { }, };

That's a weird initializer and why do you need an array here?

struct msi_msg msg = { };

Should be sufficient, no?

> +
> + imsic_irq_compose_vector_msg(vec, msg);
> + irq_data_get_irq_chip(d)->irq_write_msi_msg(d, msg);
> +}

> +static int imsic_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
> + unsigned int nr_irqs, void *args)
> +{
> + struct imsic_vector *vec;
> +
> + /* Legacy-MSI or multi-MSI not supported yet. */

Coming back to my earlier question:

>> What's legacy MSI in that context?
>
> The legacy-MSI is the MSI support in PCI v2.2 where
> number of MSIs allocated by device were either 1, 2, 4,
> 8, 16, or 32 and the data written is <data_word> + <irqnum>.

You are talking about PCI/MSI, where allocating more than one vector
is called multi-MSI, in contrast to the modern v3.0 variant, PCI/MSI-X.

So this should be "Multi-MSI is not supported yet", no?
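
For reference, the multi-MSI behaviour described in the quoted text (a
power-of-two number of vectors up to 32, with the vector number combined
into the data word) can be sketched as below. The demo_* helpers are
hypothetical, and the sketch assumes the base data word is aligned to
the vector count so that the vector number occupies the low bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Multi-MSI permits only power-of-two vector counts up to 32. */
static bool demo_multi_msi_count_valid(unsigned int nvec)
{
	return nvec >= 1 && nvec <= 32 && (nvec & (nvec - 1)) == 0;
}

/*
 * Data value the device writes for vector 'vec' out of 'nvec', following
 * the "<data_word> + <irqnum>" description. The base data word is assumed
 * to be aligned to nvec so that the vector number occupies the low bits.
 */
static uint16_t demo_multi_msi_data(uint16_t base, unsigned int nvec, unsigned int vec)
{
	return (uint16_t)((base & ~(nvec - 1)) | (vec & (nvec - 1)));
}
```

With a single vector (nvec == 1) the data word is written unmodified,
which is the only case the driver accepts here.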

> + if (nr_irqs > 1)
> + return -ENOTSUPP;
> +
> + vec = imsic_vector_alloc(virq, cpu_online_mask);
> + if (!vec)
> + return -ENOSPC;
> +
> + irq_domain_set_info(domain, virq, virq,
> + &imsic_irq_base_chip, vec,
> + handle_simple_irq, NULL, NULL);

Please utilize the 100 characters.

> + irq_set_noprobe(virq);
> + irq_set_affinity(virq, cpu_online_mask);
> +
> + return 0;
> +}

Thanks,

tglx

2024-02-20 13:35:48

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v13 08/13] irqchip/riscv-imsic: Add device MSI domain support for PCI devices

On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> static bool imsic_init_dev_msi_info(struct device *dev,
> struct irq_domain *domain,
> struct irq_domain *real_parent,
> @@ -218,6 +241,7 @@ static bool imsic_init_dev_msi_info(struct device *dev,
>
> /* MSI parent domain specific settings */
> switch (real_parent->bus_token) {
> + case DOMAIN_BUS_PCI_MSI:

case DOMAIN_BUS_PCI_DEVICE_MSIX:

?

Thanks,

tglx

2024-02-20 13:41:18

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v13 10/13] irqchip: Add RISC-V advanced PLIC driver for direct-mode

On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> +/*
> + * To handle an APLIC direct interrupts, we just read the CLAIMI register
> + * which will return highest priority pending interrupt and clear the
> + * pending bit of the interrupt. This process is repeated until CLAIMI
> + * register return zero value.
> + */
> +static void aplic_direct_handle_irq(struct irq_desc *desc)
> +{
> + struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> + struct irq_domain *irqdomain = idc->direct->irqdomain;
> + struct irq_chip *chip = irq_desc_get_chip(desc);
> + irq_hw_number_t hw_irq;
> + int irq;
> +
> + chained_irq_enter(chip, desc);
> +
> + while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> + hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> + irq = irq_find_mapping(irqdomain, hw_irq);
> +
> + if (unlikely(irq <= 0))
> + dev_warn_ratelimited(idc->direct->priv.dev,
> + "hw_irq %lu mapping not found\n", hw_irq);

Lacks brackets. See Documentation....

> + else
> + generic_handle_irq(irq);
> + }
> +
> +static int aplic_direct_starting_cpu(unsigned int cpu)
> +{
> + if (aplic_direct_parent_irq)
> + enable_percpu_irq(aplic_direct_parent_irq,
> + irq_get_trigger_type(aplic_direct_parent_irq));

Ditto.

> + return 0;
> +}

> +void aplic_init_hw_global(struct aplic_priv *priv, bool msi_mode)
> +{
> + u32 val;
> +#ifdef CONFIG_RISCV_M_MODE
> + u32 valH;

No camel case please.

> +
> + if (msi_mode) {
> + val = lower_32_bits(priv->msicfg.base_ppn);
> + valH = FIELD_PREP(APLIC_xMSICFGADDRH_BAPPN, upper_32_bits(priv->msicfg.base_ppn));
> + valH |= FIELD_PREP(APLIC_xMSICFGADDRH_LHXW, priv->msicfg.lhxw);
> + valH |= FIELD_PREP(APLIC_xMSICFGADDRH_HHXW, priv->msicfg.hhxw);
> + valH |= FIELD_PREP(APLIC_xMSICFGADDRH_LHXS, priv->msicfg.lhxs);
> + valH |= FIELD_PREP(APLIC_xMSICFGADDRH_HHXS, priv->msicfg.hhxs);
> + writel(val, priv->regs + APLIC_xMSICFGADDR);
> + writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> + }

Thanks,

tglx

2024-02-20 16:40:10

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v13 07/13] irqchip/riscv-imsic: Add device MSI domain support for platform devices

On Tue, Feb 20, 2024 at 5:23 PM Björn Töpel <bjorn@kernel.org> wrote:
>
> Anup Patel <[email protected]> writes:
>
> > The Linux platform MSI support allows per-device MSI domains so let
> > us add a platform irqchip driver for RISC-V IMSIC which provides a
> > base IRQ domain with MSI parent support for platform device domains.
> >
> > This driver assumes that the IMSIC state is already initialized by
> > the IMSIC early driver.
> >
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > drivers/irqchip/Makefile | 2 +-
> > drivers/irqchip/irq-riscv-imsic-platform.c | 346 +++++++++++++++++++++
> > drivers/irqchip/irq-riscv-imsic-state.h | 1 +
> > 3 files changed, 348 insertions(+), 1 deletion(-)
> > create mode 100644 drivers/irqchip/irq-riscv-imsic-platform.c
> >
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index d714724387ce..abca445a3229 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -95,7 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> > obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> > obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> > obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> > -obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o
> > +obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o irq-riscv-imsic-platform.o
> > obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> > obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> > obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
> > diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c
> > new file mode 100644
> > index 000000000000..7ee44c493dbc
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-imsic-platform.c
> > @@ -0,0 +1,346 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > +#include <linux/bitmap.h>
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqdomain.h>
> > +#include <linux/module.h>
> > +#include <linux/msi.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/smp.h>
> > +
> > +#include "irq-riscv-imsic-state.h"
> > +
> > +static bool imsic_cpu_page_phys(unsigned int cpu, unsigned int guest_index,
> > + phys_addr_t *out_msi_pa)
> > +{
> > + struct imsic_global_config *global;
> > + struct imsic_local_config *local;
> > +
> > + global = &imsic->global;
> > + local = per_cpu_ptr(global->local, cpu);
> > +
> > + if (BIT(global->guest_index_bits) <= guest_index)
> > + return false;
> > +
> > + if (out_msi_pa)
> > + *out_msi_pa = local->msi_pa +
> > + (guest_index * IMSIC_MMIO_PAGE_SZ);
> > +
> > + return true;
> > +}
> > +
> > +static void imsic_irq_mask(struct irq_data *d)
> > +{
> > + imsic_vector_mask(irq_data_get_irq_chip_data(d));
> > +}
> > +
> > +static void imsic_irq_unmask(struct irq_data *d)
> > +{
> > + imsic_vector_unmask(irq_data_get_irq_chip_data(d));
> > +}
> > +
> > +static int imsic_irq_retrigger(struct irq_data *d)
> > +{
> > + struct imsic_vector *vec = irq_data_get_irq_chip_data(d);
> > + struct imsic_local_config *local;
> > +
> > + if (WARN_ON(vec == NULL))
>
> Checkpatch: use !vec

Okay, I will update.

>
> > + return -ENOENT;
> > +
> > + local = per_cpu_ptr(imsic->global.local, vec->cpu);
> > + writel_relaxed(vec->local_id, local->msi_va);
> > + return 0;
> > +}
> > +
> > +static void imsic_irq_compose_vector_msg(struct imsic_vector *vec, struct msi_msg *msg)
> > +{
> > + phys_addr_t msi_addr;
> > +
> > + if (WARN_ON(vec == NULL))
>
> Checkpatch: use !vec

Okay, I will update.

>
> > + return;
> > +
> > + if (WARN_ON(!imsic_cpu_page_phys(vec->cpu, 0, &msi_addr)))
> > + return;
> > +
> > + msg->address_hi = upper_32_bits(msi_addr);
> > + msg->address_lo = lower_32_bits(msi_addr);
> > + msg->data = vec->local_id;
> > +}
> > +
> > +static void imsic_irq_compose_msg(struct irq_data *d, struct msi_msg *msg)
> > +{
> > + imsic_irq_compose_vector_msg(irq_data_get_irq_chip_data(d), msg);
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static void imsic_msi_update_msg(struct irq_data *d, struct imsic_vector *vec)
> > +{
> > + struct msi_msg msg[2] = { [1] = { }, };
> > +
> > + imsic_irq_compose_vector_msg(vec, msg);
> > + irq_data_get_irq_chip(d)->irq_write_msi_msg(d, msg);
> > +}
> > +
> > +static int imsic_irq_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
> > + bool force)
> > +{
> > + struct imsic_vector *old_vec, *new_vec;
> > + struct irq_data *pd = d->parent_data;
> > +
> > + old_vec = irq_data_get_irq_chip_data(pd);
> > + if (WARN_ON(old_vec == NULL))
>
> Checkpatch: use !old_vec

Okay, I will update.

>
> > + return -ENOENT;
> > +
> > + /* If old vector cpu belongs to the target cpumask then do nothing */
> > + if (cpumask_test_cpu(old_vec->cpu, mask_val))
> > + return IRQ_SET_MASK_OK_DONE;
> > +
> > + /* If move is already in-flight then return failure */
> > + if (imsic_vector_get_move(old_vec))
> > + return -EBUSY;
> > +
> > + /* Get a new vector on the desired set of CPUs */
> > + new_vec = imsic_vector_alloc(old_vec->hwirq, mask_val);
> > + if (!new_vec)
> > + return -ENOSPC;
> > +
> > + /* Point device to the new vector */
> > + imsic_msi_update_msg(d, new_vec);
> > +
> > + /* Update irq descriptors with the new vector */
> > + pd->chip_data = new_vec;
> > +
> > + /* Update effective affinity of parent irq data */
> > + irq_data_update_effective_affinity(pd, cpumask_of(new_vec->cpu));
> > +
> > + /* Move state of the old vector to the new vector */
> > + imsic_vector_move(old_vec, new_vec);
> > +
> > + return IRQ_SET_MASK_OK_DONE;
> > +}
> > +#endif
> > +
> > +static struct irq_chip imsic_irq_base_chip = {
> > + .name = "IMSIC",
> > + .irq_mask = imsic_irq_mask,
> > + .irq_unmask = imsic_irq_unmask,
> > + .irq_retrigger = imsic_irq_retrigger,
> > + .irq_compose_msi_msg = imsic_irq_compose_msg,
> > + .flags = IRQCHIP_SKIP_SET_WAKE |
> > + IRQCHIP_MASK_ON_SUSPEND,
> > +};
> > +
> > +static int imsic_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
> > + unsigned int nr_irqs, void *args)
> > +{
> > + struct imsic_vector *vec;
> > +
> > + /* Legacy-MSI or multi-MSI not supported yet. */
> > + if (nr_irqs > 1)
> > + return -ENOTSUPP;
>
> Checkpatch: WARNING: ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP

Okay, I will update.

Regards,
Anup
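
For readers following the address arithmetic in imsic_cpu_page_phys() and imsic_irq_compose_vector_msg() quoted above, here is a minimal user-space sketch. It is not the driver code: the sketch_* names are hypothetical, and the 4 KiB page size is an assumption about IMSIC_MMIO_PAGE_SZ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: each IMSIC interrupt file occupies one 4 KiB MMIO page. */
#define SKETCH_IMSIC_MMIO_PAGE_SZ 4096ULL

/* Hypothetical stand-in for the per-CPU part of the IMSIC configuration. */
struct sketch_local_config {
	uint64_t msi_pa;	/* physical address of this CPU's first MSI page */
};

/*
 * Same bounds check and arithmetic as the quoted imsic_cpu_page_phys():
 * reject guest indices that do not fit in guest_index_bits, otherwise
 * step guest_index pages past the CPU's base MSI page.
 */
static bool sketch_cpu_page_phys(const struct sketch_local_config *local,
				 unsigned int guest_index_bits,
				 unsigned int guest_index, uint64_t *out_msi_pa)
{
	if ((1ULL << guest_index_bits) <= guest_index)
		return false;

	if (out_msi_pa)
		*out_msi_pa = local->msi_pa + guest_index * SKETCH_IMSIC_MMIO_PAGE_SZ;

	return true;
}

/* Split a 64-bit MSI address into the two halves stored in an MSI message. */
static uint32_t sketch_upper_32_bits(uint64_t v) { return (uint32_t)(v >> 32); }
static uint32_t sketch_lower_32_bits(uint64_t v) { return (uint32_t)v; }
```

With guest_index_bits = 2, guest indices 0..3 are accepted and index 4 is rejected; the driver itself always passes guest index 0 when composing the message.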

2024-02-20 16:45:46

by Anup Patel

Subject: Re: [PATCH v13 06/13] irqchip: Add RISC-V incoming MSI controller early driver

On Tue, Feb 20, 2024 at 6:45 PM Thomas Gleixner <[email protected]> wrote:
>
> On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> > The RISC-V advanced interrupt architecture (AIA) specification
> > defines a new MSI controller called incoming message signalled
> > interrupt controller (IMSIC) which manages MSI on per-HART (or
> > per-CPU) basis. It also supports IPIs as software injected MSIs.
> > (For more details refer https://github.com/riscv/riscv-aia)
> >
> > Let us add an early irqchip driver for RISC-V IMSIC which sets
> > up the IMSIC state and provide IPIs.
>
> s/Let us add/Add/

Okay, I will update.

>
> > +#else
> > +static void imsic_ipi_starting_cpu(void)
> > +{
> > +}
> > +
> > +static void imsic_ipi_dying_cpu(void)
> > +{
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(void)
> > +{
> > + return 0;
> > +}
>
> Please condense this into
>
> static void imsic_ipi_starting_cpu(void) { }
> static void imsic_ipi_dying_cpu(void) { }
> static int __init imsic_ipi_domain_init(void) { return 0; }
>
> No point in wasting space for these stubs.

Sure, I will update.

>
> > + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> > + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> > + * Linux interrupt number and let Linux IRQ subsystem handle it.
> > + */
> > +static void imsic_handle_irq(struct irq_desc *desc)
> > +{
> > + struct irq_chip *chip = irq_desc_get_chip(desc);
> > + int err, cpu = smp_processor_id();
> > + struct imsic_vector *vec;
> > + unsigned long local_id;
> > +
> > + chained_irq_enter(chip, desc);
> > +
> > + while ((local_id = csr_swap(CSR_TOPEI, 0))) {
> > + local_id = local_id >> TOPEI_ID_SHIFT;
> > +
> > + if (local_id == IMSIC_IPI_ID) {
> > +#ifdef CONFIG_SMP
>
> if (IS_ENABLED(CONFIG_SMP))

Okay, I will update.

>
> > + ipi_mux_process();
> > +#endif
> > + continue;
> > + }
>
> > +
> > +/* MUST be called with lpriv->lock held */
>
> Instead of a comment which cannot be enforced just have
>
> lockdep_assert_held(&lpriv->lock);
>
> right at the top of the function. That documents the requirement and
> lets lockdep yell if not followed.
>
> > +#ifdef CONFIG_SMP
> > +static void imsic_local_timer_callback(struct timer_list *timer)
> > +{
> > + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&lpriv->lock, flags);
> > + __imsic_local_sync(lpriv);
> > + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> > +}
> > +
> > +/* MUST be called with lpriv->lock held */
>
> Ditto
>
> > +static void __imsic_remote_sync(struct imsic_local_priv *lpriv, unsigned int cpu)
>
> > +void imsic_vector_mask(struct imsic_vector *vec)
> > +{
> > + struct imsic_local_priv *lpriv;
> > +
> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> > + return;
>
> WARN_ON_ONCE(), no?
>
> > +bool imsic_vector_isenabled(struct imsic_vector *vec)
> > +{
> > + struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> > + unsigned long flags;
> > + bool ret;
> > +
> > + raw_spin_lock_irqsave(&lpriv->lock, flags);
> > + ret = vec->enable;
> > + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
>
> I'm not sure what you are trying to protect here. vec->enable can
> obviously change right after the lock is dropped. So that's just a
> snapshot, which is not any better than using
>
> READ_ONCE(vec->enable);
>
> and a corresponding WRITE_ONCE() at the update site, which obviously
> needs serialization.
>
> > +static void __init imsic_local_cleanup(void)
> > +{
> > + int cpu;
> > + struct imsic_local_priv *lpriv;
>
> struct imsic_local_priv *lpriv;
> int cpu;
>
> Please.
>
> > +void imsic_state_offline(void)
> > +{
> > +#ifdef CONFIG_SMP
> > + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> > +#endif
>
> You can move that into the #ifdef below.
>
> > + unsigned long flags;
> > +
> > + raw_spin_lock_irqsave(&imsic->matrix_lock, flags);
> > + irq_matrix_offline(imsic->matrix);
> > + raw_spin_unlock_irqrestore(&imsic->matrix_lock, flags);
> > +
> > +#ifdef CONFIG_SMP
> > + raw_spin_lock_irqsave(&lpriv->lock, flags);
> > + WARN_ON_ONCE(try_to_del_timer_sync(&lpriv->timer) < 0);
> > + raw_spin_unlock_irqrestore(&lpriv->lock, flags);
> > +#endif
> > +}
>
>
> Thanks,
>
> tglx
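
The READ_ONCE()/WRITE_ONCE() suggestion for imsic_vector_isenabled() can be sketched in user space as follows. This is only an illustration under assumptions: the SKETCH_* macros are simplified stand-ins for the kernel helpers (they rely on the GNU __typeof__ extension), and the sketch_* struct and functions are hypothetical.

```c
#include <stdbool.h>

/* Simplified user-space stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(). */
#define SKETCH_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define SKETCH_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical reduction of struct imsic_vector to the one field discussed. */
struct sketch_vector {
	bool enable;
};

/* Reader: a lockless snapshot, no better or worse than lock/read/unlock. */
static bool sketch_vector_isenabled(struct sketch_vector *vec)
{
	return SKETCH_READ_ONCE(vec->enable);
}

/*
 * Update site: in the driver this would still run under lpriv->lock,
 * because concurrent updaters need serialization.
 */
static void sketch_vector_set_enable(struct sketch_vector *vec, bool enable)
{
	SKETCH_WRITE_ONCE(vec->enable, enable);
}
```

The point of the review comment is that the value can change right after the reader returns either way, so taking the lock around a single read buys nothing.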

2024-02-20 16:52:23

by Anup Patel

Subject: Re: [PATCH v13 07/13] irqchip/riscv-imsic: Add device MSI domain support for platform devices

On Tue, Feb 20, 2024 at 7:02 PM Thomas Gleixner <[email protected]> wrote:
>
> On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> > +#ifdef CONFIG_SMP
> > +static void imsic_msi_update_msg(struct irq_data *d, struct imsic_vector *vec)
> > +{
> > + struct msi_msg msg[2] = { [1] = { }, };
>
> That's a weird initializer and why do you need an array here?
>
> struct msi_msg msg = { };
>
> Should be sufficient, no?

I had used irq_msi_update_msg() in arch/x86/kernel/apic/msi.c
as a reference.

I tried "struct msi_msg msg = { };" and it works fine so
I will update.

>
> > +
> > + imsic_irq_compose_vector_msg(vec, msg);
> > + irq_data_get_irq_chip(d)->irq_write_msi_msg(d, msg);
> > +}
>
> > +static int imsic_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
> > + unsigned int nr_irqs, void *args)
> > +{
> > + struct imsic_vector *vec;
> > +
> > + /* Legacy-MSI or multi-MSI not supported yet. */
>
> Coming back to my earlier question:
>
> >> What's legacy MSI in that context?
> >
> > The legacy-MSI is the MSI support in PCI v2.2 where
> > number of MSIs allocated by device were either 1, 2, 4,
> > 8, 16, or 32 and the data written is <data_word> + <irqnum>.
>
> You talk about PCI/MSI, where more than one vector is named
> multi-MSI. Contrary to the modern v3.0 variant which is PCI/MSI-X.
>
> So this should be "Multi-MSI is not supported yet", no?

Yes, I agree. We should just call it "Multi-MSI is not supported yet"
to avoid confusion. I will update.

>
> > + if (nr_irqs > 1)
> > + return -ENOTSUPP;
> > +
> > + vec = imsic_vector_alloc(virq, cpu_online_mask);
> > + if (!vec)
> > + return -ENOSPC;
> > +
> > + irq_domain_set_info(domain, virq, virq,
> > + &imsic_irq_base_chip, vec,
> > + handle_simple_irq, NULL, NULL);
>
> Please utilize the 100 characters.

Okay, I will update.

>
> > + irq_set_noprobe(virq);
> > + irq_set_affinity(virq, cpu_online_mask);
> > +
> > + return 0;
> > +}
>
> Thanks,
>
> tglx

Thanks,
Anup
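
As an illustration of the initializer change agreed above, the following user-space sketch uses a single zero-initialized message instead of a two-element array. The sketch_* names are hypothetical and the struct only copies the three msi_msg fields that the patch touches; `{ 0 }` is used here for portability where the kernel would write `{ }`.

```c
#include <stdint.h>

/* Hypothetical user-space copy of the msi_msg fields used in the patch. */
struct sketch_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* Fill in address and data, as imsic_irq_compose_vector_msg() does. */
static void sketch_compose_msg(uint64_t msi_addr, uint32_t local_id,
			       struct sketch_msi_msg *msg)
{
	msg->address_hi = (uint32_t)(msi_addr >> 32);
	msg->address_lo = (uint32_t)msi_addr;
	msg->data = local_id;
}

static struct sketch_msi_msg sketch_update_msg(uint64_t msi_addr, uint32_t local_id)
{
	/* One zero-initialized message; no array is needed. */
	struct sketch_msi_msg msg = { 0 };

	sketch_compose_msg(msi_addr, local_id, &msg);
	return msg;
}
```

In the driver the message would then be handed to the chip's irq_write_msi_msg() callback by address (&msg) rather than relying on array decay.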

2024-02-20 17:12:54

by Thomas Gleixner

Subject: Re: [PATCH v13 07/13] irqchip/riscv-imsic: Add device MSI domain support for platform devices

On Tue, Feb 20 2024 at 22:22, Anup Patel wrote:
> On Tue, Feb 20, 2024 at 7:02 PM Thomas Gleixner <[email protected]> wrote:
>>
>> On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
>> > +#ifdef CONFIG_SMP
>> > +static void imsic_msi_update_msg(struct irq_data *d, struct imsic_vector *vec)
>> > +{
>> > + struct msi_msg msg[2] = { [1] = { }, };
>>
>> That's a weird initializer and why do you need an array here?
>>
>> struct msi_msg msg = { };
>>
>> Should be sufficient, no?
>
> I had taken reference from irq_msi_update_msg() in
> arch/x86/kernel/apic/msi.c

Which is equally bogus :)

The charm of copy and pasta...

Thanks,

tglx

2024-02-20 17:22:06

by Anup Patel

Subject: Re: [PATCH v13 08/13] irqchip/riscv-imsic: Add device MSI domain support for PCI devices

On Tue, Feb 20, 2024 at 7:05 PM Thomas Gleixner <[email protected]> wrote:
>
> On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> > static bool imsic_init_dev_msi_info(struct device *dev,
> > struct irq_domain *domain,
> > struct irq_domain *real_parent,
> > @@ -218,6 +241,7 @@ static bool imsic_init_dev_msi_info(struct device *dev,
> >
> > /* MSI parent domain specific settings */
> > switch (real_parent->bus_token) {
> > + case DOMAIN_BUS_PCI_MSI:
>
> case DOMAIN_BUS_PCI_DEVICE_MSIX:
>
> ?

Actually, the DOMAIN_BUS_PCI_MSI is not needed because
the real parent domain is always the IMSIC base irq_domain
so DOMAIN_BUS_NEXUS is sufficient.

Better to just drop DOMAIN_BUS_PCI_MSI from this switch case ?

Regards,
Anup

2024-02-20 20:04:08

by Thomas Gleixner

Subject: Re: [PATCH v13 08/13] irqchip/riscv-imsic: Add device MSI domain support for PCI devices

On Tue, Feb 20 2024 at 22:51, Anup Patel wrote:

> On Tue, Feb 20, 2024 at 7:05 PM Thomas Gleixner <[email protected]> wrote:
>>
>> On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
>> > static bool imsic_init_dev_msi_info(struct device *dev,
>> > struct irq_domain *domain,
>> > struct irq_domain *real_parent,
>> > @@ -218,6 +241,7 @@ static bool imsic_init_dev_msi_info(struct device *dev,
>> >
>> > /* MSI parent domain specific settings */
>> > switch (real_parent->bus_token) {
>> > + case DOMAIN_BUS_PCI_MSI:
>>
>> case DOMAIN_BUS_PCI_DEVICE_MSIX:
>>
>> ?
>
> Actually, the DOMAIN_BUS_PCI_MSI is not needed because
> the real parent domain is always the IMSIC base irq_domain
> so DOMAIN_BUS_NEXUS is sufficient.

Indeed. Obviously I can't read.

> Better to just drop DOMAIN_BUS_PCI_MSI from this switch case ?

Yes. It would actually be a bug if that ends up as the real parent
domain.

Thanks,

tglx

2024-02-21 05:42:44

by Anup Patel

Subject: Re: [PATCH v13 10/13] irqchip: Add RISC-V advanced PLIC driver for direct-mode

On Tue, Feb 20, 2024 at 7:10 PM Thomas Gleixner <[email protected]> wrote:
>
> On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> > +/*
> > + * To handle an APLIC direct interrupts, we just read the CLAIMI register
> > + * which will return highest priority pending interrupt and clear the
> > + * pending bit of the interrupt. This process is repeated until CLAIMI
> > + * register return zero value.
> > + */
> > +static void aplic_direct_handle_irq(struct irq_desc *desc)
> > +{
> > + struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > + struct irq_domain *irqdomain = idc->direct->irqdomain;
> > + struct irq_chip *chip = irq_desc_get_chip(desc);
> > + irq_hw_number_t hw_irq;
> > + int irq;
> > +
> > + chained_irq_enter(chip, desc);
> > +
> > + while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > + hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > + irq = irq_find_mapping(irqdomain, hw_irq);
> > +
> > + if (unlikely(irq <= 0))
> > + dev_warn_ratelimited(idc->direct->priv.dev,
> > + "hw_irq %lu mapping not found\n", hw_irq);
>
> Lacks brackets. See Documentation....

Okay, I will update.

>
> > + else
> > + generic_handle_irq(irq);
> > + }
> > +
> > +static int aplic_direct_starting_cpu(unsigned int cpu)
> > +{
> > + if (aplic_direct_parent_irq)
> > + enable_percpu_irq(aplic_direct_parent_irq,
> > + irq_get_trigger_type(aplic_direct_parent_irq));
>
> Ditto.

Okay, I will update.

>
> > + return 0;
> > +}
>
> > +void aplic_init_hw_global(struct aplic_priv *priv, bool msi_mode)
> > +{
> > + u32 val;
> > +#ifdef CONFIG_RISCV_M_MODE
> > + u32 valH;
>
> No camel case please.

Okay.

>
> > +
> > + if (msi_mode) {
> > + val = lower_32_bits(priv->msicfg.base_ppn);
> > + valH = FIELD_PREP(APLIC_xMSICFGADDRH_BAPPN, upper_32_bits(priv->msicfg.base_ppn));
> > + valH |= FIELD_PREP(APLIC_xMSICFGADDRH_LHXW, priv->msicfg.lhxw);
> > + valH |= FIELD_PREP(APLIC_xMSICFGADDRH_HHXW, priv->msicfg.hhxw);
> > + valH |= FIELD_PREP(APLIC_xMSICFGADDRH_LHXS, priv->msicfg.lhxs);
> > + valH |= FIELD_PREP(APLIC_xMSICFGADDRH_HHXS, priv->msicfg.hhxs);
> > + writel(val, priv->regs + APLIC_xMSICFGADDR);
> > + writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > + }
>
> Thanks,
>
> tglx

Regards,
Anup
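
The claim loop described in the aplic_direct_handle_irq() comment above can be modelled in user space as below. This is a sketch under assumptions, not the driver: the CLAIMI register is replaced by a hypothetical queue of values, the sketch_* names are invented, and the ID shift of 16 is an assumption about APLIC_IDC_TOPI_ID_SHIFT.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the interrupt ID sits above bit 16 of the CLAIMI value. */
#define SKETCH_TOPI_ID_SHIFT 16

/* Hypothetical model of one IDC: a queue of values CLAIMI would return. */
struct sketch_idc {
	const uint32_t *claimi_values;
	size_t nr_values;
	size_t next;
};

/* Each read claims one pending interrupt; zero means nothing is pending. */
static uint32_t sketch_read_claimi(struct sketch_idc *idc)
{
	if (idc->next >= idc->nr_values)
		return 0;
	return idc->claimi_values[idc->next++];
}

/* Repeat until CLAIMI reads as zero, recording each claimed interrupt ID. */
static size_t sketch_handle_irq(struct sketch_idc *idc, uint32_t *handled, size_t max)
{
	size_t n = 0;
	uint32_t val;

	while ((val = sketch_read_claimi(idc))) {
		uint32_t hw_irq = val >> SKETCH_TOPI_ID_SHIFT;

		if (n < max) {
			handled[n] = hw_irq;
			n++;
		}
	}

	return n;
}
```

Where the sketch records the ID, the driver looks up the Linux interrupt with irq_find_mapping() and calls generic_handle_irq(), or warns if no mapping exists.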

2024-02-21 11:59:35

by Björn Töpel

Subject: Re: [PATCH v13 06/13] irqchip: Add RISC-V incoming MSI controller early driver

Anup Patel <[email protected]> writes:

>> > +void imsic_vector_mask(struct imsic_vector *vec)
>> > +{
>> > + struct imsic_local_priv *lpriv;
>> > +
>> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
>> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
>> > + return;
>> > +
>> > + /*
>> > + * This function is called through Linux irq subsystem with
>> > + * irqs disabled so no need to save/restore irq flags.
>> > + */
>> > +
>> > + raw_spin_lock(&lpriv->lock);
>> > +
>> > + vec->enable = false;
>> > + bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
>> > + __imsic_remote_sync(lpriv, vec->cpu);
>> > +
>> > + raw_spin_unlock(&lpriv->lock);
>> > +}
>>
>> Really nice that you're using a timer for the vector affinity change,
>> and got rid of the special/weird IMSIC/sync IPI. Can you really use a
>> timer for mask/unmask? That makes the mask/unmask operation
>> asynchronous!
>>
>> That was what I was trying to get through with this comment:
>> https://lore.kernel.org/linux-riscv/[email protected]/
>>
>> Also, using the smp_* IPI functions, you can pass arguments, so you
>> don't need the dirty_bitmap tracking the changes.
>
> The mask/unmask operations are called with irqs disabled so if
> CPU X does synchronous IPI to another CPU Y from mask/unmask
> operation then while CPU X is waiting for IPI to complete it cannot
> receive IPI from other CPUs which can lead to crashes and stalls.
>
> In general, we should not do any block/busy-wait work in
> mask/unmask operation of an irqchip driver.

Hmm, OK. Still, it is a bit odd that when the .irq_mask callback returns,
the masking is not actually completed.

1. CPU 0 tries to mask an interrupt tied to CPU 1.
2. The timer is queued on CPU 1.
3. The call irq_mask returns on CPU 0
4. ...the irq is masked at some future point, determined by the callback
at CPU 1

Is that the expected outcome?

There are .irq_mask implementations that do seem to go to some length
(blocking) to perform the mask, e.g. gic_mask_irq(), which calls
gic_{re,}dist_wait_for_rwp(), which has sleep/retry loops. The GICv3 ITS
code has similar things going on.

I'm not saying you're wrong, I'm just trying to wrap my head around the
masking semantics.

> The AIA IMSIC spec allows setting ID pending bit using MSI write
> irrespective whether ID is enabled or not but the interrupt will be
> taken only after ID is enabled. In other words, there will be no
> loss of interrupt with delayed mask/unmask using async IPI or
> lazy timer.

No loss, but we might *get* an interrupt when we explicitly asked not to
get any. Maybe that's ok?


Björn
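
The four-step sequence listed above can be modelled in user space as follows. This is a rough sketch of the deferred scheme as it is described in this thread, not the driver: the sketch_* names are hypothetical, the 64-ID limit is arbitrary, and the per-CPU timer is reduced to a flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_IDS 64	/* arbitrary size for this sketch */

/* Hypothetical model of the per-CPU state on the target CPU. */
struct sketch_local_priv {
	bool want_enable[SKETCH_NR_IDS];	/* software state, like vec->enable */
	bool hw_enable[SKETCH_NR_IDS];		/* models the per-ID enable bits */
	uint64_t dirty;				/* one bit per ID with stale hw state */
	bool timer_pending;			/* models the queued per-CPU timer */
};

/* Steps 1-3: the requesting CPU records the change and returns at once. */
static void sketch_vector_mask(struct sketch_local_priv *lpriv, unsigned int id)
{
	lpriv->want_enable[id] = false;
	lpriv->dirty |= 1ULL << id;
	lpriv->timer_pending = true;
}

/* Step 4: the timer callback on the target CPU applies only the dirty IDs. */
static void sketch_local_sync(struct sketch_local_priv *lpriv)
{
	for (unsigned int id = 0; id < SKETCH_NR_IDS; id++) {
		if (!(lpriv->dirty & (1ULL << id)))
			continue;
		lpriv->hw_enable[id] = lpriv->want_enable[id];
	}

	lpriv->dirty = 0;
	lpriv->timer_pending = false;
}
```

Between the two calls the hardware enable bit still holds its old value, which is exactly the window being questioned here.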

2024-02-21 12:24:24

by Anup Patel

Subject: Re: [PATCH v13 06/13] irqchip: Add RISC-V incoming MSI controller early driver

On Wed, Feb 21, 2024 at 5:29 PM Björn Töpel <bjorn@kernel.org> wrote:
>
> Anup Patel <[email protected]> writes:
>
> >> > +void imsic_vector_mask(struct imsic_vector *vec)
> >> > +{
> >> > + struct imsic_local_priv *lpriv;
> >> > +
> >> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> >> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> >> > + return;
> >> > +
> >> > + /*
> >> > + * This function is called through Linux irq subsystem with
> >> > + * irqs disabled so no need to save/restore irq flags.
> >> > + */
> >> > +
> >> > + raw_spin_lock(&lpriv->lock);
> >> > +
> >> > + vec->enable = false;
> >> > + bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
> >> > + __imsic_remote_sync(lpriv, vec->cpu);
> >> > +
> >> > + raw_spin_unlock(&lpriv->lock);
> >> > +}
> >>
> >> Really nice that you're using a timer for the vector affinity change,
> >> and got rid of the special/weird IMSIC/sync IPI. Can you really use a
> >> timer for mask/unmask? That makes the mask/unmask operation
> >> asynchronous!
> >>
> >> That was what I was trying to get through with this comment:
> >> https://lore.kernel.org/linux-riscv/[email protected]/
> >>
> >> Also, using the smp_* IPI functions, you can pass arguments, so you
> >> don't need the dirty_bitmap tracking the changes.
> >
> > The mask/unmask operations are called with irqs disabled so if
> > CPU X does synchronous IPI to another CPU Y from mask/unmask
> > operation then while CPU X is waiting for IPI to complete it cannot
> > receive IPI from other CPUs which can lead to crashes and stalls.
> >
> > In general, we should not do any block/busy-wait work in
> > mask/unmask operation of an irqchip driver.
>
> Hmm, OK. Still, a bit odd that when the .irq_mask callback return, the
> masking is not actually completed.
>
> 1. CPU 0 tries to mask an interrupt tied to CPU 1.
> 2. The timer is queued on CPU 1.
> 3. The call irq_mask returns on CPU 0
> 4. ...the irq is masked at some future point, determined by the callback
> at CPU 1
>
> Is that the expected outcome?

Yes, that's right.

>
> There are .irq_mask implementation that does seem to go at length
> (blocking) to perform the mask, e.g.: gic_mask_irq() which calls
> gic_{re,}dist_wait_for_rwp that have sleep/retry loops. The GIC3 ITS
> code has similar things going on.

gic_{re,}dist_wait_for_rwp() polls a HW register for completion, which
is certain to finish in a predictable time, whereas waiting for an IPI
to be executed by another CPU is unpredictable and fragile.

>
> I'm not saying you're wrong, I'm just trying to wrap my head around the
> masking semantics.
>
> > The AIA IMSIC spec allows setting ID pending bit using MSI write
> > irrespective whether ID is enabled or not but the interrupt will be
> > taken only after ID is enabled. In other words, there will be no
> > loss of interrupt with delayed mask/unmask using async IPI or
> > lazy timer.
>
> No loss, but we might *get* an interrupt when we explicitly asked not to
> get any. Maybe that's ok?
>

The delayed spurious interrupt after masking is avoided by additional
masking at the source of the interrupt. For wired-to-MSI interrupts, we
have additional masking in the APLIC in MSI-mode. For PCI MSI interrupts,
we have additional masking at the PCI device level using pci_msi_mask_irq().

Regards,
Anup
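
The pending/enable behaviour referred to in this exchange (an MSI write sets the pending bit regardless of the enable bit, and the interrupt is taken only once the ID is enabled) can be sketched in user space as below. The sketch_* names are hypothetical and the model is limited to IDs 1..63; it illustrates the semantics only, not the CSR interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one interrupt file: pending and enable bit per ID. */
struct sketch_intr_file {
	uint64_t pending;
	uint64_t enabled;
};

/* An incoming MSI write sets the pending bit even if the ID is disabled. */
static void sketch_msi_write(struct sketch_intr_file *f, unsigned int id)
{
	f->pending |= 1ULL << id;
}

static void sketch_set_enable(struct sketch_intr_file *f, unsigned int id, bool on)
{
	if (on)
		f->enabled |= 1ULL << id;
	else
		f->enabled &= ~(1ULL << id);
}

/*
 * Return the lowest-numbered ID that is both pending and enabled and clear
 * its pending bit, or 0 if there is none (ID 0 is not a valid interrupt).
 */
static unsigned int sketch_claim(struct sketch_intr_file *f)
{
	uint64_t ready = f->pending & f->enabled;

	for (unsigned int id = 1; id < 64; id++) {
		if (ready & (1ULL << id)) {
			f->pending &= ~(1ULL << id);
			return id;
		}
	}

	return 0;
}
```

An MSI that arrives while the ID is disabled stays pending and is delivered after the enable, so a delayed unmask loses nothing; the delayed mask case is what the additional masking at the source covers.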

2024-02-21 13:37:16

by Anup Patel

Subject: Re: [PATCH v13 02/13] irqchip/sifive-plic: Improve locking safety by using irqsave/irqrestore

On Tue, Feb 20, 2024 at 3:41 PM Thomas Gleixner <[email protected]> wrote:
>
> On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> > Now that PLIC driver is probed as a regular platform driver, the lock
> > dependency validator complains about the safety of handler->enable_lock
> > usage:
> >
> > [ 0.956775] Possible interrupt unsafe locking scenario:
> >
> > [ 0.956998] CPU0 CPU1
> > [ 0.957247] ---- ----
> > [ 0.957439] lock(&handler->enable_lock);
> > [ 0.957607] local_irq_disable();
> > [ 0.957793] lock(&irq_desc_lock_class);
> > [ 0.958021] lock(&handler->enable_lock);
> > [ 0.958246] <Interrupt>
> > [ 0.958342] lock(&irq_desc_lock_class);
> > [ 0.958501]
> > *** DEADLOCK ***
> >
> > To address above, let's use raw_spin_lock_irqsave/unlock_irqrestore()
> > instead of raw_spin_lock/unlock().
>
> s/let's//

Okay, I will update.

Regards,
Anup

2024-02-21 13:49:24

by Anup Patel

Subject: Re: [PATCH v13 03/13] irqchip/riscv-intc: Add support for RISC-V AIA

On Tue, Feb 20, 2024 at 3:43 PM Thomas Gleixner <[email protected]> wrote:
>
> On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
>
> > The RISC-V advanced interrupt architecture (AIA) extends the per-HART
> > local interrupts in following ways:
> > 1. Minimum 64 local interrupts for both RV32 and RV64
> > 2. Ability to process multiple pending local interrupts in same
> > interrupt handler
> > 3. Priority configuration for each local interrupts
> > 4. Special CSRs to configure/access the per-HART MSI controller
> >
> > We add support for #1 and #2 described above in the RISC-V intc
> > driver.
>
> s/We add/Add/

Okay, I will update.

>
> > +static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)
> > +{
> > + unsigned long topi;
> > +
> > + while ((topi = csr_read(CSR_TOPI)))
> > + generic_handle_domain_irq(intc_domain,
> > + topi >> TOPI_IID_SHIFT);
>
> Please let it stick out. You got 100 characters. All over the place.

Okay, I will update.

Regards,
Anup

2024-02-21 17:22:49

by Björn Töpel

Subject: Re: [PATCH v13 06/13] irqchip: Add RISC-V incoming MSI controller early driver

Anup Patel <[email protected]> writes:

> On Wed, Feb 21, 2024 at 5:29 PM Björn Töpel <[email protected]> wrote:
>>
>> Anup Patel <[email protected]> writes:
>>
>> >> > +void imsic_vector_mask(struct imsic_vector *vec)
>> >> > +{
>> >> > + struct imsic_local_priv *lpriv;
>> >> > +
>> >> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
>> >> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
>> >> > + return;
>> >> > +
>> >> > + /*
>> >> > + * This function is called through Linux irq subsystem with
>> >> > + * irqs disabled so no need to save/restore irq flags.
>> >> > + */
>> >> > +
>> >> > + raw_spin_lock(&lpriv->lock);
>> >> > +
>> >> > + vec->enable = false;
>> >> > + bitmap_set(lpriv->dirty_bitmap, vec->local_id, 1);
>> >> > + __imsic_remote_sync(lpriv, vec->cpu);
>> >> > +
>> >> > + raw_spin_unlock(&lpriv->lock);
>> >> > +}
>> >>
>> >> Really nice that you're using a timer for the vector affinity change,
>> >> and got rid of the special/weird IMSIC/sync IPI. Can you really use a
>> >> timer for mask/unmask? That makes the mask/unmask operation
>> >> asynchronous!
>> >>
>> >> That was what I was trying to get through with this comment:
>> >> https://lore.kernel.org/linux-riscv/[email protected]/
>> >>
>> >> Also, using the smp_* IPI functions, you can pass arguments, so you
>> >> don't need the dirty_bitmap tracking the changes.
>> >
>> > The mask/unmask operations are called with irqs disabled so if
>> > CPU X does synchronous IPI to another CPU Y from mask/unmask
>> > operation then while CPU X is waiting for IPI to complete it cannot
>> > receive IPI from other CPUs which can lead to crashes and stalls.
>> >
>> > In general, we should not do any block/busy-wait work in
>> > mask/unmask operation of an irqchip driver.
>>
>> Hmm, OK. Still, a bit odd that when the .irq_mask callback return, the
>> masking is not actually completed.
>>
>> 1. CPU 0 tries to mask an interrupt tied to CPU 1.
>> 2. The timer is queued on CPU 1.
>> 3. The call irq_mask returns on CPU 0
>> 4. ...the irq is masked at some future point, determined by the callback
>> at CPU 1
>>
>> Is that the expected outcome?
>
> Yes, that's right.
>
>>
>> There are .irq_mask implementation that does seem to go at length
>> (blocking) to perform the mask, e.g.: gic_mask_irq() which calls
>> gic_{re,}dist_wait_for_rwp that have sleep/retry loops. The GIC3 ITS
>> code has similar things going on.
>
> The gic_{re,}dist_wait_for_rwp() polls on a HW register for completion
> which will certainly complete in a predictable time whereas waiting
> for IPI to be executed by another CPU is not predictable and fragile.
>
>>
>> I'm not saying you're wrong, I'm just trying to wrap my head around the
>> masking semantics.
>>
>> > The AIA IMSIC spec allows setting ID pending bit using MSI write
>> > irrespective whether ID is enabled or not but the interrupt will be
>> > taken only after ID is enabled. In other words, there will be no
>> > loss of interrupt with delayed mask/unmask using async IPI or
>> > lazy timer.
>>
>> No loss, but we might *get* an interrupt when we explicitly asked not to
>> get any. Maybe that's ok?
>>
>
> The delayed spurious interrupt after masking is avoided by additional
> masking at the source of interrupt. For wired-to-MSI interrupts, we have
> additional masking on the APLIC MSI-mode. For PCI MSI interrupts, we
> have additional masking at PCI device level using pci_msi_mask_irq().

Thanks for the clarifications, Anup! Much appreciated!

2024-02-22 09:35:50

by Anup Patel

Subject: Re: [PATCH v13 01/13] irqchip/sifive-plic: Convert PLIC driver into a platform driver

On Tue, Feb 20, 2024 at 3:39 PM Thomas Gleixner <[email protected]> wrote:
>
> On Tue, Feb 20 2024 at 11:37, Anup Patel wrote:
> > The PLIC driver does not require very early initialization so let
> > us convert it into a platform driver.
>
> s/let us convert/convert/
>
> Please use passive voice and imperative mood throughout the changelogs.
> No we/us, let....

Okay, I will update.

>
> > As part of the conversion, the PLIC probing undergoes the following
> > changes:
> > 1. Use dev_info(), dev_err() and dev_warn() instead of pr_info(),
> > pr_err() and pr_warn()
> > 2. Use devm_xyz() APIs wherever applicable
> > 3. PLIC is now probed after CPUs are brought-up so we have to
> > setup cpuhp state after context handler of all online CPUs
> > are initialized otherwise we see crash on multi-socket systems
>
> This patch is really doing too many things at once, which makes it hard
> to review. Can you split this into digestible pieces please?

Sure, I will split this into smaller, more granular patches.

>
> > if (unlikely(err))
> > - pr_warn_ratelimited("can't find mapping for hwirq %lu\n",
> > + dev_warn_ratelimited(handler->priv->dev,
> > + "can't find mapping for hwirq %lu\n",
> > hwirq);
>
> Nit. Please use brackets around the condition. See:
>
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#bracket-rules
>
> for reasoning.

Okay, I will update.

Regards,
Anup