2023-10-23 17:28:42

by Anup Patel

Subject: [PATCH v11 00/14] Linux RISC-V AIA Support

The RISC-V AIA specification has been ratified as per the RISC-V International
process. The latest ratified AIA specification can be found at:
https://github.com/riscv/riscv-aia/releases/download/1.0/riscv-interrupts-1.0.pdf

At a high level, the AIA specification adds three things:
1) AIA CSRs
- Improved local interrupt support
2) Incoming Message Signaled Interrupt Controller (IMSIC)
- Per-HART MSI controller
- Supports MSI virtualization
- Supports IPIs along with virtualization
3) Advanced Platform-Level Interrupt Controller (APLIC)
- Wired interrupt controller
- In MSI mode, converts wired interrupts into MSIs (i.e. acts as an MSI generator)
- In Direct mode, injects external interrupts directly into HARTs
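
To illustrate how the improved local interrupt support is consumed, here is a
simplified sketch of the riscv_intc_aia_irq() handler from PATCH5 of this
series (for illustration only; see the patch below for the actual code):

  static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)
  {
          unsigned long topi;

          /* xTOPI holds (IID << TOPI_IID_SHIFT) of the highest-priority
           * pending-and-enabled local interrupt, or zero when none is left. */
          while ((topi = csr_read(CSR_TOPI)))
                  generic_handle_domain_irq(intc_domain,
                                            topi >> TOPI_IID_SHIFT);
  }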

For an overview of the AIA specification, refer to the AIA virtualization
talk at KVM Forum 2022:
https://static.sched.com/hosted_files/kvmforum2022/a1/AIA_Virtualization_in_KVM_RISCV_final.pdf
https://www.youtube.com/watch?v=r071dL8Z0yo

To test this series, use QEMU v7.2 (or higher) and OpenSBI v1.2 (or higher).

These patches can also be found in the riscv_aia_v11 branch at:
https://github.com/avpatel/linux.git

Changes since v10:
- Rebased on Linux-6.6-rc7
- Dropped PATCH3 of the v10 series since it has been merged by Marc Z
for Linux-6.6-rc7
- Changed the IMSIC ID management strategy from a 1-n approach to an
x86-style 1-1 approach

Changes since v9:
- Rebased on Linux-6.6-rc4
- Use builtin_platform_driver() in PATCH5, PATCH9, and PATCH12

Changes since v8:
- Rebased on Linux-6.6-rc3
- Dropped PATCH2 of the v8 series since we won't be requiring
riscv_get_intc_hartid(), based on Marc Z's comments on ACPI AIA support.
- Addressed Saravana's comments in PATCH3 of v8 series
- Updated PATCH9 and PATCH13 of the v8 series based on comments from Sunil

Changes since v7:
- Rebased on Linux-6.6-rc1
- Addressed comments on PATCH1 of v7 series and split it into two PATCHes
- Use DEFINE_SIMPLE_PROP() in PATCH2 of v7 series

Changes since v6:
- Rebased on Linux-6.5-rc4
- Updated PATCH2 to use IS_ENABLED(CONFIG_SPARC) instead of
!IS_ENABLED(CONFIG_OF_IRQ)
- Added new PATCH4 to fix syscore registration in PLIC driver
- Updated PATCH5 to convert the PLIC driver into a full-blown platform
driver with a rewritten probe function.

Changes since v5:
- Rebased on Linux-6.5-rc2
- Updated the overall series to ensure that only IPI, timer, and
INTC drivers are probed very early whereas the rest of the interrupt
controllers (such as PLIC, APLIC, and IMSIC) are probed as
regular platform drivers.
- Renamed riscv_fw_parent_hartid() to riscv_get_intc_hartid()
- New PATCH1 to add fw_devlink support for msi-parent DT property
- New PATCH2 to ensure all INTC suppliers are initialized, which in turn
fixes the probing issue for PLIC, APLIC and IMSIC as platform drivers
- New PATCH3 to use platform driver probing for PLIC
- Re-structured the IMSIC driver into two separate drivers: early and
platform. The IMSIC early driver (PATCH7) only initializes the IMSIC state
and provides IPIs, whereas the IMSIC platform driver (PATCH8) is probed as
a regular platform driver and provides the MSI domain for platform devices.
- Re-structured the APLIC platform driver into three separate sources: main,
direct mode, and MSI mode.

Changes since v4:
- Rebased on Linux-6.5-rc1
- Added "Dependencies" in the APLIC bindings (PATCH6 in v4)
- Dropped the PATCH6 which was changing the IOMMU DMA domain APIs
- Dropped use of IOMMU DMA APIs in the IMSIC driver (PATCH4)

Changes since v3:
- Rebased on Linux-6.4-rc6
- Dropped PATCH2 of the v3 series; instead we now set FWNODE_FLAG_BEST_EFFORT via
IRQCHIP_DECLARE()
- Extend riscv_fw_parent_hartid() to support both DT and ACPI in PATCH1
- Extend iommu_dma_compose_msi_msg() instead of adding iommu_dma_select_msi()
in PATCH6
- Addressed Conor's comments in PATCH3
- Addressed Conor's and Rob's comments in PATCH7

Changes since v2:
- Rebased on Linux-6.4-rc1
- Addressed Rob's comments on DT bindings patches 4 and 8.
- Addressed Marc's comments on IMSIC driver PATCH5
- Replaced use of OF APIs in the APLIC and IMSIC drivers with fwnode APIs;
this makes both drivers easily portable for ACPI support. This also
removes unnecessary indirection from the APLIC and IMSIC drivers.
- PATCH1 is a new patch for portability with ACPI support
- PATCH2 is a new patch to fix probing in APLIC drivers for APLIC-only systems.
- PATCH7 is a new patch which addresses the IOMMU DMA domain issues pointed
out by SiFive

Changes since v1:
- Rebased on Linux-6.2-rc2
- Addressed comments on IMSIC DT bindings for PATCH4
- Use raw_spin_lock_irqsave() on ids_lock for PATCH5
- Improved MMIO alignment checks in PATCH5 to allow MMIO regions
with holes.
- Addressed comments on APLIC DT bindings for PATCH6
- Fixed warning splat in aplic_msi_write_msg() caused by
zeroed MSI message in PATCH7
- Dropped the DT property riscv,slow-ipi; instead we will have a module
parameter in the future.

Anup Patel (14):
RISC-V: Don't fail in riscv_of_parent_hartid() for disabled HARTs
of: property: Add fw_devlink support for msi-parent
irqchip/sifive-plic: Fix syscore registration for multi-socket systems
irqchip/sifive-plic: Convert PLIC driver into a platform driver
irqchip/riscv-intc: Add support for RISC-V AIA
dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
irqchip: Add RISC-V incoming MSI controller early driver
irqchip/riscv-imsic: Add support for platform MSI irqdomain
irqchip/riscv-imsic: Add support for PCI MSI irqdomain
dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
irqchip: Add RISC-V advanced PLIC driver for direct-mode
irqchip/riscv-aplic: Add support for MSI-mode
RISC-V: Select APLIC and IMSIC drivers
MAINTAINERS: Add entry for RISC-V AIA drivers

.../interrupt-controller/riscv,aplic.yaml | 172 ++++
.../interrupt-controller/riscv,imsics.yaml | 172 ++++
MAINTAINERS | 14 +
arch/riscv/Kconfig | 2 +
arch/riscv/kernel/cpu.c | 11 +-
drivers/irqchip/Kconfig | 24 +
drivers/irqchip/Makefile | 3 +
drivers/irqchip/irq-riscv-aplic-direct.c | 343 +++++++
drivers/irqchip/irq-riscv-aplic-main.c | 232 +++++
drivers/irqchip/irq-riscv-aplic-main.h | 53 +
drivers/irqchip/irq-riscv-aplic-msi.c | 285 ++++++
drivers/irqchip/irq-riscv-imsic-early.c | 235 +++++
drivers/irqchip/irq-riscv-imsic-platform.c | 360 +++++++
drivers/irqchip/irq-riscv-imsic-state.c | 962 ++++++++++++++++++
drivers/irqchip/irq-riscv-imsic-state.h | 110 ++
drivers/irqchip/irq-riscv-intc.c | 34 +-
drivers/irqchip/irq-sifive-plic.c | 242 +++--
drivers/of/property.c | 2 +
include/linux/irqchip/riscv-aplic.h | 119 +++
include/linux/irqchip/riscv-imsic.h | 87 ++
20 files changed, 3359 insertions(+), 103 deletions(-)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
create mode 100644 drivers/irqchip/irq-riscv-aplic-direct.c
create mode 100644 drivers/irqchip/irq-riscv-aplic-main.c
create mode 100644 drivers/irqchip/irq-riscv-aplic-main.h
create mode 100644 drivers/irqchip/irq-riscv-aplic-msi.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-early.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-platform.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-state.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-state.h
create mode 100644 include/linux/irqchip/riscv-aplic.h
create mode 100644 include/linux/irqchip/riscv-imsic.h

--
2.34.1


2023-10-23 17:28:46

by Anup Patel

Subject: [PATCH v11 03/14] irqchip/sifive-plic: Fix syscore registration for multi-socket systems

On multi-socket systems, we will have a separate PLIC in each socket, so
we should register the syscore operations only once rather than once per
PLIC instance.

Fixes: e80f0b6a2cf3 ("irqchip/irq-sifive-plic: Add syscore callbacks for hibernation")
Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/irq-sifive-plic.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index e1484905b7bd..5b7bc4fd9517 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -532,17 +532,18 @@ static int __init __plic_init(struct device_node *node,
}

/*
- * We can have multiple PLIC instances so setup cpuhp state only
- * when context handler for current/boot CPU is present.
+ * We can have multiple PLIC instances so setup cpuhp state
+ * and register syscore operations only when context handler
+ * for current/boot CPU is present.
*/
handler = this_cpu_ptr(&plic_handlers);
if (handler->present && !plic_cpuhp_setup_done) {
cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
"irqchip/sifive/plic:starting",
plic_starting_cpu, plic_dying_cpu);
+ register_syscore_ops(&plic_irq_syscore_ops);
plic_cpuhp_setup_done = true;
}
- register_syscore_ops(&plic_irq_syscore_ops);

pr_info("%pOFP: mapped %d interrupts with %d handlers for"
" %d contexts.\n", node, nr_irqs, nr_handlers, nr_contexts);
--
2.34.1

2023-10-23 17:28:53

by Anup Patel

Subject: [PATCH v11 01/14] RISC-V: Don't fail in riscv_of_parent_hartid() for disabled HARTs

riscv_of_processor_hartid(), which is used by riscv_of_parent_hartid(), fails
for HARTs disabled in the DT. This results in the following warning
thrown by the RISC-V INTC driver for the E-core on SiFive boards:

[ 0.000000] riscv-intc: unable to find hart id for /cpus/cpu@0/interrupt-controller

riscv_of_parent_hartid() is only expected to read the hartid from
the DT, so we should directly call of_get_cpu_hwid() instead of calling
riscv_of_processor_hartid().

Fixes: ad635e723e17 ("riscv: cpu: Add 64bit hartid support on RV64")
Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Atish Patra <[email protected]>
---
arch/riscv/kernel/cpu.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index c17dacb1141c..157ace8b262c 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -125,13 +125,14 @@ int __init riscv_early_of_processor_hartid(struct device_node *node, unsigned lo
*/
int riscv_of_parent_hartid(struct device_node *node, unsigned long *hartid)
{
- int rc;
-
for (; node; node = node->parent) {
if (of_device_is_compatible(node, "riscv")) {
- rc = riscv_of_processor_hartid(node, hartid);
- if (!rc)
- return 0;
+ *hartid = (unsigned long)of_get_cpu_hwid(node, 0);
+ if (*hartid == ~0UL) {
+ pr_warn("Found CPU without hart ID\n");
+ return -ENODEV;
+ }
+ return 0;
}
}

--
2.34.1

2023-10-23 17:28:57

by Anup Patel

Subject: [PATCH v11 02/14] of: property: Add fw_devlink support for msi-parent

This allows fw_devlink to create device links between the consumers of
an MSI and the supplier of that MSI.
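
For context, the DEFINE_SIMPLE_PROP(msi_parent, "msi-parent", "#msi-cells")
entry added below roughly expands to a parser along these lines; this is a
simplified sketch of the existing drivers/of/property.c machinery, shown only
for illustration:

  static struct device_node *parse_msi_parent(struct device_node *np,
                                              const char *prop_name, int index)
  {
          /* Resolve the phandle at @index of the "msi-parent" property,
           * honouring "#msi-cells" in the supplier node, so that fw_devlink
           * can record the MSI supplier as a dependency of this consumer. */
          return parse_prop_cells(np, prop_name, index, "msi-parent",
                                  "#msi-cells");
  }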

Signed-off-by: Anup Patel <[email protected]>
Acked-by: Rob Herring <[email protected]>
Reviewed-by: Saravana Kannan <[email protected]>
---
drivers/of/property.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/of/property.c b/drivers/of/property.c
index cf8dacf3e3b8..afdaefbd03f6 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1267,6 +1267,7 @@ DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
DEFINE_SIMPLE_PROP(leds, "leds", NULL)
DEFINE_SIMPLE_PROP(backlight, "backlight", NULL)
DEFINE_SIMPLE_PROP(panel, "panel", NULL)
+DEFINE_SIMPLE_PROP(msi_parent, "msi-parent", "#msi-cells")
DEFINE_SUFFIX_PROP(regulators, "-supply", NULL)
DEFINE_SUFFIX_PROP(gpio, "-gpio", "#gpio-cells")

@@ -1356,6 +1357,7 @@ static const struct supplier_bindings of_supplier_bindings[] = {
{ .parse_prop = parse_leds, },
{ .parse_prop = parse_backlight, },
{ .parse_prop = parse_panel, },
+ { .parse_prop = parse_msi_parent, },
{ .parse_prop = parse_gpio_compat, },
{ .parse_prop = parse_interrupts, },
{ .parse_prop = parse_regulators, },
--
2.34.1

2023-10-23 17:29:12

by Anup Patel

Subject: [PATCH v11 05/14] irqchip/riscv-intc: Add support for RISC-V AIA

The RISC-V advanced interrupt architecture (AIA) extends the per-HART
local interrupts in the following ways:
1. Minimum 64 local interrupts for both RV32 and RV64
2. Ability to process multiple pending local interrupts in the same
interrupt handler
3. Priority configuration for each local interrupt
4. Special CSRs to configure/access the per-HART MSI controller

We add support for #1 and #2 described above in the RISC-V INTC driver.

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/irq-riscv-intc.c | 34 ++++++++++++++++++++++++++------
1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index e8d01b14ccdd..bab536bbaf2c 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -17,6 +17,7 @@
#include <linux/module.h>
#include <linux/of.h>
#include <linux/smp.h>
+#include <asm/hwcap.h>

static struct irq_domain *intc_domain;

@@ -30,6 +31,15 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
generic_handle_domain_irq(intc_domain, cause);
}

+static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)
+{
+ unsigned long topi;
+
+ while ((topi = csr_read(CSR_TOPI)))
+ generic_handle_domain_irq(intc_domain,
+ topi >> TOPI_IID_SHIFT);
+}
+
/*
* On RISC-V systems local interrupts are masked or unmasked by writing
* the SIE (Supervisor Interrupt Enable) CSR. As CSRs can only be written
@@ -39,12 +49,18 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)

static void riscv_intc_irq_mask(struct irq_data *d)
{
- csr_clear(CSR_IE, BIT(d->hwirq));
+ if (IS_ENABLED(CONFIG_32BIT) && d->hwirq >= BITS_PER_LONG)
+ csr_clear(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
+ else
+ csr_clear(CSR_IE, BIT(d->hwirq));
}

static void riscv_intc_irq_unmask(struct irq_data *d)
{
- csr_set(CSR_IE, BIT(d->hwirq));
+ if (IS_ENABLED(CONFIG_32BIT) && d->hwirq >= BITS_PER_LONG)
+ csr_set(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
+ else
+ csr_set(CSR_IE, BIT(d->hwirq));
}

static void riscv_intc_irq_eoi(struct irq_data *d)
@@ -115,16 +131,20 @@ static struct fwnode_handle *riscv_intc_hwnode(void)

static int __init riscv_intc_init_common(struct fwnode_handle *fn)
{
- int rc;
+ int rc, nr_irqs = riscv_isa_extension_available(NULL, SxAIA) ?
+ 64 : BITS_PER_LONG;

- intc_domain = irq_domain_create_linear(fn, BITS_PER_LONG,
+ intc_domain = irq_domain_create_linear(fn, nr_irqs,
&riscv_intc_domain_ops, NULL);
if (!intc_domain) {
pr_err("unable to add IRQ domain\n");
return -ENXIO;
}

- rc = set_handle_irq(&riscv_intc_irq);
+ if (riscv_isa_extension_available(NULL, SxAIA))
+ rc = set_handle_irq(&riscv_intc_aia_irq);
+ else
+ rc = set_handle_irq(&riscv_intc_irq);
if (rc) {
pr_err("failed to set irq handler\n");
return rc;
@@ -132,7 +152,9 @@ static int __init riscv_intc_init_common(struct fwnode_handle *fn)

riscv_set_intc_hwnode_fn(riscv_intc_hwnode);

- pr_info("%d local interrupts mapped\n", BITS_PER_LONG);
+ pr_info("%d local interrupts mapped%s\n",
+ nr_irqs, riscv_isa_extension_available(NULL, SxAIA) ?
+ " using AIA" : "");

return 0;
}
--
2.34.1

2023-10-23 17:29:23

by Anup Patel

Subject: [PATCH v11 06/14] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller

We add a DT bindings document for the RISC-V incoming MSI controller
(IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
specification.
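
For illustration only (not part of this patch), the interrupt file address
layout described in the binding below can be sketched as follows; the helper
name and its parameters are made up for this example:

  /* Hypothetical helper, only to illustrate the MSI target address layout. */
  static unsigned long imsic_file_address(unsigned long base_addr,
                                          u32 group_index, u32 group_index_shift,
                                          u32 hart_index, u32 guest_index_bits,
                                          u32 guest_index)
  {
          /* Each interrupt file is a 4KB page: the guest index selects the
           * page, the hart index sits above it, and the group index starts
           * at bit position group_index_shift. */
          return base_addr |
                 ((unsigned long)group_index << group_index_shift) |
                 ((unsigned long)hart_index << (guest_index_bits + 12)) |
                 ((unsigned long)guest_index << 12);
  }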

Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Acked-by: Krzysztof Kozlowski <[email protected]>
---
.../interrupt-controller/riscv,imsics.yaml | 172 ++++++++++++++++++
1 file changed, 172 insertions(+)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml

diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
new file mode 100644
index 000000000000..84976f17a4a1
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
@@ -0,0 +1,172 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Incoming MSI Controller (IMSIC)
+
+maintainers:
+ - Anup Patel <[email protected]>
+
+description: |
+ The RISC-V advanced interrupt architecture (AIA) defines a per-CPU incoming
+ MSI controller (IMSIC) for handling MSIs in a RISC-V platform. The RISC-V
+ AIA specification can be found at https://github.com/riscv/riscv-aia.
+
+ The IMSIC is a per-CPU (or per-HART) device with a separate interrupt file
+ for each privilege level (machine or supervisor). The configuration of
+ an IMSIC interrupt file is done using AIA CSRs and it also has a 4KB MMIO
+ space to receive MSIs from devices. Each IMSIC interrupt file supports a
+ fixed number of interrupt identities (to distinguish MSIs from devices)
+ which is the same for a given privilege level across CPUs (or HARTs).
+
+ The device tree of a RISC-V platform will have one IMSIC device tree node
+ for each privilege level (machine or supervisor) which collectively describe
+ IMSIC interrupt files at that privilege level across CPUs (or HARTs).
+
+ The arrangement of IMSIC interrupt files in MMIO space of a RISC-V platform
+ follows a particular scheme defined by the RISC-V AIA specification. An IMSIC
+ group is a set of IMSIC interrupt files co-located in MMIO space and we can
+ have multiple IMSIC groups (i.e. clusters, sockets, chiplets, etc) in a
+ RISC-V platform. The MSI target address of an IMSIC interrupt file at a given
+ privilege level (machine or supervisor) encodes the group index, HART index,
+ and guest index (shown below).
+
+ XLEN-1 > (HART Index MSB) 12 0
+ | | | |
+ -------------------------------------------------------------
+ |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index| 0 |
+ -------------------------------------------------------------
+
+allOf:
+ - $ref: /schemas/interrupt-controller.yaml#
+ - $ref: /schemas/interrupt-controller/msi-controller.yaml#
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - qemu,imsics
+ - const: riscv,imsics
+
+ reg:
+ minItems: 1
+ maxItems: 16384
+ description:
+ Base address of each IMSIC group.
+
+ interrupt-controller: true
+
+ "#interrupt-cells":
+ const: 0
+
+ msi-controller: true
+
+ "#msi-cells":
+ const: 0
+
+ interrupts-extended:
+ minItems: 1
+ maxItems: 16384
+ description:
+ This property represents the set of CPUs (or HARTs) for which the given
+ device tree node describes the IMSIC interrupt files. Each node pointed
+ to should be a riscv,cpu-intc node, which has a CPU node (i.e. RISC-V
+ HART) as parent.
+
+ riscv,num-ids:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 63
+ maximum: 2047
+ description:
+ Number of interrupt identities supported by an IMSIC interrupt file.
+
+ riscv,num-guest-ids:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 63
+ maximum: 2047
+ description:
+ Number of interrupt identities supported by an IMSIC guest interrupt
+ file. When not specified, it is assumed to be the same as specified by
+ the riscv,num-ids property.
+
+ riscv,guest-index-bits:
+ minimum: 0
+ maximum: 7
+ default: 0
+ description:
+ Number of guest index bits in the MSI target address.
+
+ riscv,hart-index-bits:
+ minimum: 0
+ maximum: 15
+ description:
+ Number of HART index bits in the MSI target address. When not
+ specified it is calculated based on the interrupts-extended property.
+
+ riscv,group-index-bits:
+ minimum: 0
+ maximum: 7
+ default: 0
+ description:
+ Number of group index bits in the MSI target address.
+
+ riscv,group-index-shift:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 0
+ maximum: 55
+ default: 24
+ description:
+ The least significant bit position of the group index bits in the
+ MSI target address.
+
+required:
+ - compatible
+ - reg
+ - interrupt-controller
+ - msi-controller
+ - "#msi-cells"
+ - interrupts-extended
+ - riscv,num-ids
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ // Example 1 (Machine-level IMSIC files with just one group):
+
+ interrupt-controller@24000000 {
+ compatible = "qemu,imsics", "riscv,imsics";
+ interrupts-extended = <&cpu1_intc 11>,
+ <&cpu2_intc 11>,
+ <&cpu3_intc 11>,
+ <&cpu4_intc 11>;
+ reg = <0x28000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <0>;
+ msi-controller;
+ #msi-cells = <0>;
+ riscv,num-ids = <127>;
+ };
+
+ - |
+ // Example 2 (Supervisor-level IMSIC files with two groups):
+
+ interrupt-controller@28000000 {
+ compatible = "qemu,imsics", "riscv,imsics";
+ interrupts-extended = <&cpu1_intc 9>,
+ <&cpu2_intc 9>,
+ <&cpu3_intc 9>,
+ <&cpu4_intc 9>;
+ reg = <0x28000000 0x2000>, /* Group0 IMSICs */
+ <0x29000000 0x2000>; /* Group1 IMSICs */
+ interrupt-controller;
+ #interrupt-cells = <0>;
+ msi-controller;
+ #msi-cells = <0>;
+ riscv,num-ids = <127>;
+ riscv,group-index-bits = <1>;
+ riscv,group-index-shift = <24>;
+ };
+...
--
2.34.1

2023-10-23 17:30:39

by Anup Patel

Subject: [PATCH v11 11/14] irqchip: Add RISC-V advanced PLIC driver for direct-mode

The RISC-V advanced interrupt architecture (AIA) specification defines
the advanced platform-level interrupt controller (APLIC), which has two
modes of operation: 1) Direct mode and 2) MSI mode.
(For more details, refer to https://github.com/riscv/riscv-aia)

In APLIC Direct mode, wired interrupts are forwarded to CPUs (or HARTs)
as local external interrupts.

We add a platform irqchip driver for the RISC-V APLIC Direct mode to
support RISC-V platforms that have only wired interrupts.
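
Condensed from the driver below (shown only for illustration; the helper name
aplic_direct_drain() is made up for this sketch): per-CPU delivery is enabled
through the hart's IDC page, and pending wired interrupts are then claimed by
reading the CLAIMI register until it returns zero:

  /* Illustrative sketch of per-hart delivery enable and the claim loop. */
  static void aplic_direct_drain(struct aplic_idc *idc)
  {
          u32 claimi;
          int irq;

          /* Enable delivery for this hart's IDC; threshold 0 masks nothing. */
          writel(APLIC_ENABLE_ITHRESHOLD, idc->regs + APLIC_IDC_ITHRESHOLD);
          writel(APLIC_ENABLE_IDELIVERY, idc->regs + APLIC_IDC_IDELIVERY);

          /* Claim and handle pending interrupts until CLAIMI reads back zero. */
          while ((claimi = readl(idc->regs + APLIC_IDC_CLAIMI))) {
                  irq = irq_find_mapping(idc->direct->irqdomain,
                                         claimi >> APLIC_IDC_TOPI_ID_SHIFT);
                  if (irq > 0)
                          generic_handle_irq(irq);
          }
  }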

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Kconfig | 5 +
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-riscv-aplic-direct.c | 343 +++++++++++++++++++++++
drivers/irqchip/irq-riscv-aplic-main.c | 232 +++++++++++++++
drivers/irqchip/irq-riscv-aplic-main.h | 45 +++
include/linux/irqchip/riscv-aplic.h | 119 ++++++++
6 files changed, 745 insertions(+)
create mode 100644 drivers/irqchip/irq-riscv-aplic-direct.c
create mode 100644 drivers/irqchip/irq-riscv-aplic-main.c
create mode 100644 drivers/irqchip/irq-riscv-aplic-main.h
create mode 100644 include/linux/irqchip/riscv-aplic.h

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index c1d69b418dfb..1996cc6f666a 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -546,6 +546,11 @@ config SIFIVE_PLIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP

+config RISCV_APLIC
+ bool
+ depends on RISCV
+ select IRQ_DOMAIN_HIERARCHY
+
config RISCV_IMSIC
bool
depends on RISCV
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index abca445a3229..7f8289790ed8 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
+obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic-main.o irq-riscv-aplic-direct.o
obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o irq-riscv-imsic-platform.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
diff --git a/drivers/irqchip/irq-riscv-aplic-direct.c b/drivers/irqchip/irq-riscv-aplic-direct.c
new file mode 100644
index 000000000000..9ed2666bfb5e
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic-direct.c
@@ -0,0 +1,343 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-aplic.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/printk.h>
+#include <linux/smp.h>
+
+#include "irq-riscv-aplic-main.h"
+
+#define APLIC_DISABLE_IDELIVERY 0
+#define APLIC_ENABLE_IDELIVERY 1
+#define APLIC_DISABLE_ITHRESHOLD 1
+#define APLIC_ENABLE_ITHRESHOLD 0
+
+struct aplic_direct {
+ struct aplic_priv priv;
+ struct irq_domain *irqdomain;
+ struct cpumask lmask;
+};
+
+struct aplic_idc {
+ unsigned int hart_index;
+ void __iomem *regs;
+ struct aplic_direct *direct;
+};
+
+static unsigned int aplic_direct_parent_irq;
+static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
+
+static void aplic_direct_irq_eoi(struct irq_data *d)
+{
+ /*
+ * The fasteoi_handler requires irq_eoi() callback hence
+ * provide a dummy handler.
+ */
+}
+
+#ifdef CONFIG_SMP
+static int aplic_direct_set_affinity(struct irq_data *d,
+ const struct cpumask *mask_val, bool force)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ struct aplic_direct *direct =
+ container_of(priv, struct aplic_direct, priv);
+ struct aplic_idc *idc;
+ unsigned int cpu, val;
+ struct cpumask amask;
+ void __iomem *target;
+
+ cpumask_and(&amask, &direct->lmask, mask_val);
+
+ if (force)
+ cpu = cpumask_first(&amask);
+ else
+ cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+ if (cpu >= nr_cpu_ids)
+ return -EINVAL;
+
+ idc = per_cpu_ptr(&aplic_idcs, cpu);
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+ val <<= APLIC_TARGET_HART_IDX_SHIFT;
+ val |= APLIC_DEFAULT_PRIORITY;
+ writel(val, target);
+
+ irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+ return IRQ_SET_MASK_OK_DONE;
+}
+#endif
+
+static struct irq_chip aplic_direct_chip = {
+ .name = "APLIC-DIRECT",
+ .irq_mask = aplic_irq_mask,
+ .irq_unmask = aplic_irq_unmask,
+ .irq_set_type = aplic_irq_set_type,
+ .irq_eoi = aplic_direct_irq_eoi,
+#ifdef CONFIG_SMP
+ .irq_set_affinity = aplic_direct_set_affinity,
+#endif
+ .flags = IRQCHIP_SET_TYPE_MASKED |
+ IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int aplic_direct_irqdomain_translate(struct irq_domain *d,
+ struct irq_fwspec *fwspec,
+ unsigned long *hwirq,
+ unsigned int *type)
+{
+ struct aplic_priv *priv = d->host_data;
+
+ return aplic_irqdomain_translate(fwspec, priv->gsi_base,
+ hwirq, type);
+}
+
+static int aplic_direct_irqdomain_alloc(struct irq_domain *domain,
+ unsigned int virq, unsigned int nr_irqs,
+ void *arg)
+{
+ int i, ret;
+ unsigned int type;
+ irq_hw_number_t hwirq;
+ struct irq_fwspec *fwspec = arg;
+ struct aplic_priv *priv = domain->host_data;
+ struct aplic_direct *direct =
+ container_of(priv, struct aplic_direct, priv);
+
+ ret = aplic_irqdomain_translate(fwspec, priv->gsi_base,
+ &hwirq, &type);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++) {
+ irq_domain_set_info(domain, virq + i, hwirq + i,
+ &aplic_direct_chip, priv,
+ handle_fasteoi_irq, NULL, NULL);
+ irq_set_affinity(virq + i, &direct->lmask);
+ /* See the reason described in aplic_msi_irqdomain_alloc() */
+ irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
+ }
+
+ return 0;
+}
+
+static const struct irq_domain_ops aplic_direct_irqdomain_ops = {
+ .translate = aplic_direct_irqdomain_translate,
+ .alloc = aplic_direct_irqdomain_alloc,
+ .free = irq_domain_free_irqs_top,
+};
+
+/*
+ * To handle APLIC direct interrupts, we just read the CLAIMI register,
+ * which returns the highest-priority pending interrupt and clears its
+ * pending bit. This process is repeated until the CLAIMI register
+ * returns zero.
+ */
+static void aplic_direct_handle_irq(struct irq_desc *desc)
+{
+ struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ struct irq_domain *irqdomain = idc->direct->irqdomain;
+ irq_hw_number_t hw_irq;
+ int irq;
+
+ chained_irq_enter(chip, desc);
+
+ while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
+ hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
+ irq = irq_find_mapping(irqdomain, hw_irq);
+
+ if (unlikely(irq <= 0))
+ dev_warn_ratelimited(idc->direct->priv.dev,
+ "hw_irq %lu mapping not found\n",
+ hw_irq);
+ else
+ generic_handle_irq(irq);
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
+{
+ u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
+ u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
+
+ /* Priority must be less than threshold for interrupt triggering */
+ writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
+
+ /* Delivery must be set to 1 for interrupt triggering */
+ writel(de, idc->regs + APLIC_IDC_IDELIVERY);
+}
+
+static int aplic_direct_dying_cpu(unsigned int cpu)
+{
+ if (aplic_direct_parent_irq)
+ disable_percpu_irq(aplic_direct_parent_irq);
+
+ return 0;
+}
+
+static int aplic_direct_starting_cpu(unsigned int cpu)
+{
+ if (aplic_direct_parent_irq)
+ enable_percpu_irq(aplic_direct_parent_irq,
+ irq_get_trigger_type(aplic_direct_parent_irq));
+
+ return 0;
+}
+
+static int aplic_direct_parse_parent_hwirq(struct device *dev,
+ u32 index, u32 *parent_hwirq,
+ unsigned long *parent_hartid)
+{
+ struct of_phandle_args parent;
+ int rc;
+
+ /*
+ * Currently, only OF fwnode is supported so extend this
+ * function for ACPI support.
+ */
+ if (!is_of_node(dev->fwnode))
+ return -EINVAL;
+
+ rc = of_irq_parse_one(to_of_node(dev->fwnode), index, &parent);
+ if (rc)
+ return rc;
+
+ rc = riscv_of_parent_hartid(parent.np, parent_hartid);
+ if (rc)
+ return rc;
+
+ *parent_hwirq = parent.args[0];
+ return 0;
+}
+
+int aplic_direct_setup(struct device *dev, void __iomem *regs)
+{
+ int i, j, rc, cpu, setup_count = 0;
+ struct aplic_direct *direct;
+ struct aplic_priv *priv;
+ struct irq_domain *domain;
+ unsigned long hartid;
+ struct aplic_idc *idc;
+ u32 val, hwirq;
+
+ direct = kzalloc(sizeof(*direct), GFP_KERNEL);
+ if (!direct)
+ return -ENOMEM;
+ priv = &direct->priv;
+
+ rc = aplic_setup_priv(priv, dev, regs);
+ if (rc) {
+ dev_err(dev, "failed to create APLIC context\n");
+ kfree(direct);
+ return rc;
+ }
+
+ /* Setup per-CPU IDC and target CPU mask */
+ for (i = 0; i < priv->nr_idcs; i++) {
+ rc = aplic_direct_parse_parent_hwirq(dev, i, &hwirq, &hartid);
+ if (rc) {
+ dev_warn(dev, "parent irq for IDC%d not found\n", i);
+ continue;
+ }
+
+ /*
+ * Skip interrupts other than external interrupts for
+ * current privilege level.
+ */
+ if (hwirq != RV_IRQ_EXT)
+ continue;
+
+ cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0) {
+ dev_warn(dev, "invalid cpuid for IDC%d\n", i);
+ continue;
+ }
+
+ cpumask_set_cpu(cpu, &direct->lmask);
+
+ idc = per_cpu_ptr(&aplic_idcs, cpu);
+ idc->hart_index = i;
+ idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
+ idc->direct = direct;
+
+ aplic_idc_set_delivery(idc, true);
+
+ /*
+ * Boot cpu might not have APLIC hart_index = 0 so check
+ * and update target registers of all interrupts.
+ */
+ if (cpu == smp_processor_id() && idc->hart_index) {
+ val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+ val <<= APLIC_TARGET_HART_IDX_SHIFT;
+ val |= APLIC_DEFAULT_PRIORITY;
+ for (j = 1; j <= priv->nr_irqs; j++)
+ writel(val, priv->regs + APLIC_TARGET_BASE +
+ (j - 1) * sizeof(u32));
+ }
+
+ setup_count++;
+ }
+
+ /* Find parent domain and register chained handler */
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+ DOMAIN_BUS_ANY);
+ if (!aplic_direct_parent_irq && domain) {
+ aplic_direct_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (aplic_direct_parent_irq) {
+ irq_set_chained_handler(aplic_direct_parent_irq,
+ aplic_direct_handle_irq);
+
+ /*
+ * Setup CPUHP notifier to enable parent
+ * interrupt on all CPUs
+ */
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "irqchip/riscv/aplic:starting",
+ aplic_direct_starting_cpu,
+ aplic_direct_dying_cpu);
+ }
+ }
+
+ /* Fail if we were not able to setup IDC for any CPU */
+ if (!setup_count) {
+ kfree(direct);
+ return -ENODEV;
+ }
+
+ /* Setup global config and interrupt delivery */
+ aplic_init_hw_global(priv, false);
+
+ /* Create irq domain instance for the APLIC */
+ direct->irqdomain = irq_domain_create_linear(dev->fwnode,
+ priv->nr_irqs + 1,
+ &aplic_direct_irqdomain_ops,
+ priv);
+ if (!direct->irqdomain) {
+ dev_err(dev, "failed to create direct irq domain\n");
+ kfree(direct);
+ return -ENOMEM;
+ }
+
+ /* Advertise the interrupt controller */
+ dev_info(dev, "%d interrupts directly connected to %d CPUs\n",
+ priv->nr_irqs, priv->nr_idcs);
+
+ return 0;
+}
diff --git a/drivers/irqchip/irq-riscv-aplic-main.c b/drivers/irqchip/irq-riscv-aplic-main.c
new file mode 100644
index 000000000000..87450708a733
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic-main.c
@@ -0,0 +1,232 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#include <linux/of.h>
+#include <linux/of_irq.h>
+#include <linux/printk.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/irqchip/riscv-aplic.h>
+
+#include "irq-riscv-aplic-main.h"
+
+void aplic_irq_unmask(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ writel(d->hwirq, priv->regs + APLIC_SETIENUM);
+}
+
+void aplic_irq_mask(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
+}
+
+int aplic_irq_set_type(struct irq_data *d, unsigned int type)
+{
+ u32 val = 0;
+ void __iomem *sourcecfg;
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ switch (type) {
+ case IRQ_TYPE_NONE:
+ val = APLIC_SOURCECFG_SM_INACTIVE;
+ break;
+ case IRQ_TYPE_LEVEL_LOW:
+ val = APLIC_SOURCECFG_SM_LEVEL_LOW;
+ break;
+ case IRQ_TYPE_LEVEL_HIGH:
+ val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
+ break;
+ case IRQ_TYPE_EDGE_FALLING:
+ val = APLIC_SOURCECFG_SM_EDGE_FALL;
+ break;
+ case IRQ_TYPE_EDGE_RISING:
+ val = APLIC_SOURCECFG_SM_EDGE_RISE;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
+ sourcecfg += (d->hwirq - 1) * sizeof(u32);
+ writel(val, sourcecfg);
+
+ return 0;
+}
+
+int aplic_irqdomain_translate(struct irq_fwspec *fwspec, u32 gsi_base,
+ unsigned long *hwirq, unsigned int *type)
+{
+ if (WARN_ON(fwspec->param_count < 2))
+ return -EINVAL;
+ if (WARN_ON(!fwspec->param[0]))
+ return -EINVAL;
+
+ /* For DT, gsi_base is always zero. */
+ *hwirq = fwspec->param[0] - gsi_base;
+ *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
+
+ WARN_ON(*type == IRQ_TYPE_NONE);
+
+ return 0;
+}
+
+void aplic_init_hw_global(struct aplic_priv *priv, bool msi_mode)
+{
+ u32 val;
+#ifdef CONFIG_RISCV_M_MODE
+ u32 valH;
+
+ if (msi_mode) {
+ val = priv->msicfg.base_ppn;
+ valH = ((u64)priv->msicfg.base_ppn >> 32) &
+ APLIC_xMSICFGADDRH_BAPPN_MASK;
+ valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
+ << APLIC_xMSICFGADDRH_LHXW_SHIFT;
+ valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
+ << APLIC_xMSICFGADDRH_HHXW_SHIFT;
+ valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
+ << APLIC_xMSICFGADDRH_LHXS_SHIFT;
+ valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
+ << APLIC_xMSICFGADDRH_HHXS_SHIFT;
+ writel(val, priv->regs + APLIC_xMSICFGADDR);
+ writel(valH, priv->regs + APLIC_xMSICFGADDRH);
+ }
+#endif
+
+ /* Setup APLIC domaincfg register */
+ val = readl(priv->regs + APLIC_DOMAINCFG);
+ val |= APLIC_DOMAINCFG_IE;
+ if (msi_mode)
+ val |= APLIC_DOMAINCFG_DM;
+ writel(val, priv->regs + APLIC_DOMAINCFG);
+ if (readl(priv->regs + APLIC_DOMAINCFG) != val)
+ dev_warn(priv->dev, "unable to write 0x%x in domaincfg\n",
+ val);
+}
+
+static void aplic_init_hw_irqs(struct aplic_priv *priv)
+{
+ int i;
+
+ /* Disable all interrupts */
+ for (i = 0; i <= priv->nr_irqs; i += 32)
+ writel(-1U, priv->regs + APLIC_CLRIE_BASE +
+ (i / 32) * sizeof(u32));
+
+ /* Set interrupt type and default priority for all interrupts */
+ for (i = 1; i <= priv->nr_irqs; i++) {
+ writel(0, priv->regs + APLIC_SOURCECFG_BASE +
+ (i - 1) * sizeof(u32));
+ writel(APLIC_DEFAULT_PRIORITY,
+ priv->regs + APLIC_TARGET_BASE +
+ (i - 1) * sizeof(u32));
+ }
+
+ /* Clear APLIC domaincfg */
+ writel(0, priv->regs + APLIC_DOMAINCFG);
+}
+
+int aplic_setup_priv(struct aplic_priv *priv, struct device *dev,
+ void __iomem *regs)
+{
+ struct of_phandle_args parent;
+ int rc;
+
+ /*
+ * Currently, only OF fwnode is supported so extend this
+ * function for ACPI support.
+ */
+ if (!is_of_node(dev->fwnode))
+ return -EINVAL;
+
+ /* Save device pointer and register base */
+ priv->dev = dev;
+ priv->regs = regs;
+
+ /* Find out number of interrupt sources */
+ rc = of_property_read_u32(to_of_node(dev->fwnode),
+ "riscv,num-sources",
+ &priv->nr_irqs);
+ if (rc) {
+ dev_err(dev, "failed to get number of interrupt sources\n");
+ return rc;
+ }
+
+ /*
+ * Find out number of IDCs based on parent interrupts
+ *
+ * If "msi-parent" property is present then we ignore the
+ * APLIC IDCs which forces the APLIC driver to use MSI mode.
+ */
+ if (!of_property_present(to_of_node(dev->fwnode), "msi-parent")) {
+ while (!of_irq_parse_one(to_of_node(dev->fwnode),
+ priv->nr_idcs, &parent))
+ priv->nr_idcs++;
+ }
+
+ /* Setup initial state of APLIC interrupts */
+ aplic_init_hw_irqs(priv);
+
+ return 0;
+}
+
+static int aplic_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ bool msi_mode = false;
+ struct resource *res;
+ void __iomem *regs;
+ int rc;
+
+ /* Map the MMIO registers */
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res) {
+ dev_err(dev, "failed to get MMIO resource\n");
+ return -EINVAL;
+ }
+ regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
+ if (!regs) {
+ dev_err(dev, "failed map MMIO registers\n");
+ return -ENOMEM;
+ }
+
+ /*
+ * If msi-parent property is present then setup APLIC MSI
+ * mode otherwise setup APLIC direct mode.
+ */
+ if (is_of_node(dev->fwnode))
+ msi_mode = of_property_present(to_of_node(dev->fwnode),
+ "msi-parent");
+ if (msi_mode)
+ rc = -ENODEV;
+ else
+ rc = aplic_direct_setup(dev, regs);
+ if (rc) {
+ dev_err(dev, "failed setup APLIC in %s mode\n",
+ msi_mode ? "MSI" : "direct");
+ return rc;
+ }
+
+ return 0;
+}
+
+static const struct of_device_id aplic_match[] = {
+ { .compatible = "riscv,aplic" },
+ {}
+};
+
+static struct platform_driver aplic_driver = {
+ .driver = {
+ .name = "riscv-aplic",
+ .of_match_table = aplic_match,
+ },
+ .probe = aplic_probe,
+};
+builtin_platform_driver(aplic_driver);
diff --git a/drivers/irqchip/irq-riscv-aplic-main.h b/drivers/irqchip/irq-riscv-aplic-main.h
new file mode 100644
index 000000000000..474a04229334
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic-main.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#ifndef _IRQ_RISCV_APLIC_MAIN_H
+#define _IRQ_RISCV_APLIC_MAIN_H
+
+#include <linux/device.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqdomain.h>
+#include <linux/fwnode.h>
+
+#define APLIC_DEFAULT_PRIORITY 1
+
+struct aplic_msicfg {
+ phys_addr_t base_ppn;
+ u32 hhxs;
+ u32 hhxw;
+ u32 lhxs;
+ u32 lhxw;
+};
+
+struct aplic_priv {
+ struct device *dev;
+ u32 gsi_base;
+ u32 nr_irqs;
+ u32 nr_idcs;
+ void __iomem *regs;
+ struct aplic_msicfg msicfg;
+};
+
+void aplic_irq_unmask(struct irq_data *d);
+void aplic_irq_mask(struct irq_data *d);
+int aplic_irq_set_type(struct irq_data *d, unsigned int type);
+int aplic_irqdomain_translate(struct irq_fwspec *fwspec, u32 gsi_base,
+ unsigned long *hwirq, unsigned int *type);
+void aplic_init_hw_global(struct aplic_priv *priv, bool msi_mode);
+int aplic_setup_priv(struct aplic_priv *priv, struct device *dev,
+ void __iomem *regs);
+int aplic_direct_setup(struct device *dev, void __iomem *regs);
+
+#endif
diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
new file mode 100644
index 000000000000..97e198ea0109
--- /dev/null
+++ b/include/linux/irqchip/riscv-aplic.h
@@ -0,0 +1,119 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
+#define __LINUX_IRQCHIP_RISCV_APLIC_H
+
+#include <linux/bitops.h>
+
+#define APLIC_MAX_IDC BIT(14)
+#define APLIC_MAX_SOURCE 1024
+
+#define APLIC_DOMAINCFG 0x0000
+#define APLIC_DOMAINCFG_RDONLY 0x80000000
+#define APLIC_DOMAINCFG_IE BIT(8)
+#define APLIC_DOMAINCFG_DM BIT(2)
+#define APLIC_DOMAINCFG_BE BIT(0)
+
+#define APLIC_SOURCECFG_BASE 0x0004
+#define APLIC_SOURCECFG_D BIT(10)
+#define APLIC_SOURCECFG_CHILDIDX_MASK 0x000003ff
+#define APLIC_SOURCECFG_SM_MASK 0x00000007
+#define APLIC_SOURCECFG_SM_INACTIVE 0x0
+#define APLIC_SOURCECFG_SM_DETACH 0x1
+#define APLIC_SOURCECFG_SM_EDGE_RISE 0x4
+#define APLIC_SOURCECFG_SM_EDGE_FALL 0x5
+#define APLIC_SOURCECFG_SM_LEVEL_HIGH 0x6
+#define APLIC_SOURCECFG_SM_LEVEL_LOW 0x7
+
+#define APLIC_MMSICFGADDR 0x1bc0
+#define APLIC_MMSICFGADDRH 0x1bc4
+#define APLIC_SMSICFGADDR 0x1bc8
+#define APLIC_SMSICFGADDRH 0x1bcc
+
+#ifdef CONFIG_RISCV_M_MODE
+#define APLIC_xMSICFGADDR APLIC_MMSICFGADDR
+#define APLIC_xMSICFGADDRH APLIC_MMSICFGADDRH
+#else
+#define APLIC_xMSICFGADDR APLIC_SMSICFGADDR
+#define APLIC_xMSICFGADDRH APLIC_SMSICFGADDRH
+#endif
+
+#define APLIC_xMSICFGADDRH_L BIT(31)
+#define APLIC_xMSICFGADDRH_HHXS_MASK 0x1f
+#define APLIC_xMSICFGADDRH_HHXS_SHIFT 24
+#define APLIC_xMSICFGADDRH_LHXS_MASK 0x7
+#define APLIC_xMSICFGADDRH_LHXS_SHIFT 20
+#define APLIC_xMSICFGADDRH_HHXW_MASK 0x7
+#define APLIC_xMSICFGADDRH_HHXW_SHIFT 16
+#define APLIC_xMSICFGADDRH_LHXW_MASK 0xf
+#define APLIC_xMSICFGADDRH_LHXW_SHIFT 12
+#define APLIC_xMSICFGADDRH_BAPPN_MASK 0xfff
+
+#define APLIC_xMSICFGADDR_PPN_SHIFT 12
+
+#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
+ (BIT(__lhxs) - 1)
+
+#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
+ (BIT(__lhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
+ ((__lhxs))
+#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
+ (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
+ APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
+
+#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
+ (BIT(__hhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
+ ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
+#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
+ (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
+ APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
+
+#define APLIC_IRQBITS_PER_REG 32
+
+#define APLIC_SETIP_BASE 0x1c00
+#define APLIC_SETIPNUM 0x1cdc
+
+#define APLIC_CLRIP_BASE 0x1d00
+#define APLIC_CLRIPNUM 0x1ddc
+
+#define APLIC_SETIE_BASE 0x1e00
+#define APLIC_SETIENUM 0x1edc
+
+#define APLIC_CLRIE_BASE 0x1f00
+#define APLIC_CLRIENUM 0x1fdc
+
+#define APLIC_SETIPNUM_LE 0x2000
+#define APLIC_SETIPNUM_BE 0x2004
+
+#define APLIC_GENMSI 0x3000
+
+#define APLIC_TARGET_BASE 0x3004
+#define APLIC_TARGET_HART_IDX_SHIFT 18
+#define APLIC_TARGET_HART_IDX_MASK 0x3fff
+#define APLIC_TARGET_GUEST_IDX_SHIFT 12
+#define APLIC_TARGET_GUEST_IDX_MASK 0x3f
+#define APLIC_TARGET_IPRIO_MASK 0xff
+#define APLIC_TARGET_EIID_MASK 0x7ff
+
+#define APLIC_IDC_BASE 0x4000
+#define APLIC_IDC_SIZE 32
+
+#define APLIC_IDC_IDELIVERY 0x00
+
+#define APLIC_IDC_IFORCE 0x04
+
+#define APLIC_IDC_ITHRESHOLD 0x08
+
+#define APLIC_IDC_TOPI 0x18
+#define APLIC_IDC_TOPI_ID_SHIFT 16
+#define APLIC_IDC_TOPI_ID_MASK 0x3ff
+#define APLIC_IDC_TOPI_PRIO_MASK 0xff
+
+#define APLIC_IDC_CLAIMI 0x1c
+
+#endif
--
2.34.1

2023-10-23 17:31:00

by Anup Patel

[permalink] [raw]
Subject: [PATCH v11 10/14] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC

We add a DT bindings document for the RISC-V advanced platform-level
interrupt controller (APLIC) defined by the RISC-V advanced interrupt
architecture (AIA) specification.
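
For reference, the APLIC driver added later in this series picks its operating
mode based on the properties described below. This is a simplified sketch of
the probe logic from PATCH11 (the function name aplic_probe_mode() is made up;
MSI mode handling itself only arrives with PATCH12):

  /* Simplified sketch of the mode selection done in aplic_probe() (PATCH11). */
  static int aplic_probe_mode(struct device *dev, void __iomem *regs)
  {
          bool msi_mode = false;

          /* The "msi-parent" property selects MSI mode; otherwise Direct mode. */
          if (is_of_node(dev->fwnode))
                  msi_mode = of_property_present(to_of_node(dev->fwnode),
                                                 "msi-parent");

          /* MSI mode support is added by PATCH12, hence -ENODEV here. */
          return msi_mode ? -ENODEV : aplic_direct_setup(dev, regs);
  }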

Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
---
.../interrupt-controller/riscv,aplic.yaml | 172 ++++++++++++++++++
1 file changed, 172 insertions(+)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml

diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
new file mode 100644
index 000000000000..190a6499c932
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
@@ -0,0 +1,172 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Advanced Platform Level Interrupt Controller (APLIC)
+
+maintainers:
+ - Anup Patel <[email protected]>
+
+description:
+ The RISC-V advanced interrupt architecture (AIA) defines an advanced
+ platform level interrupt controller (APLIC) for handling wired interrupts
+ in a RISC-V platform. The RISC-V AIA specification can be found at
+ https://github.com/riscv/riscv-aia.
+
+ The RISC-V APLIC is implemented as hierarchical APLIC domains where all
+ interrupt sources connect to the root APLIC domain and a parent APLIC
+ domain can delegate interrupt sources to its child APLIC domains. There
+ is one device tree node for each APLIC domain.
+
+allOf:
+ - $ref: /schemas/interrupt-controller.yaml#
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - qemu,aplic
+ - const: riscv,aplic
+
+ reg:
+ maxItems: 1
+
+ interrupt-controller: true
+
+ "#interrupt-cells":
+ const: 2
+
+ interrupts-extended:
+ minItems: 1
+ maxItems: 16384
+ description:
+ Given APLIC domain directly injects external interrupts to a set of
+ RISC-V HARTs (or CPUs). Each node pointed to should be a riscv,cpu-intc
+ node, which has a CPU node (i.e. RISC-V HART) as parent.
+
+ msi-parent:
+ description:
+ Given APLIC domain forwards wired interrupts as MSIs to an AIA incoming
+ message signaled interrupt controller (IMSIC). If both "msi-parent" and
+ "interrupts-extended" properties are present then it means the APLIC
+ domain supports both MSI mode and Direct mode in HW. In this case, the
+ APLIC driver has to choose between MSI mode or Direct mode.
+
+ riscv,num-sources:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ maximum: 1023
+ description:
+ Specifies the number of wired interrupt sources supported by this
+ APLIC domain.
+
+ riscv,children:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ minItems: 1
+ maxItems: 1024
+ items:
+ maxItems: 1
+ description:
+ A list of child APLIC domains for the given APLIC domain. Each child
+ APLIC domain is assigned a child index in increasing order, with the
+ first child APLIC domain assigned child index 0. The APLIC domain child
+ index is used by firmware to delegate interrupts from the given APLIC
+ domain to a particular child APLIC domain.
+
+ riscv,delegation:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ minItems: 1
+ maxItems: 1024
+ items:
+ items:
+ - description: child APLIC domain phandle
+ - description: first interrupt number of the parent APLIC domain (inclusive)
+ - description: last interrupt number of the parent APLIC domain (inclusive)
+ description:
+ An interrupt delegation list where each entry is a triple consisting
+ of a child APLIC domain phandle, the first interrupt number of the parent
+ APLIC domain, and the last interrupt number of the parent APLIC domain.
+ Firmware must configure the interrupt delegation registers based on
+ this interrupt delegation list.
+
+dependencies:
+ riscv,delegation: [ "riscv,children" ]
+
+required:
+ - compatible
+ - reg
+ - interrupt-controller
+ - "#interrupt-cells"
+ - riscv,num-sources
+
+anyOf:
+ - required:
+ - interrupts-extended
+ - required:
+ - msi-parent
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ // Example 1 (APLIC domains directly injecting interrupt to HARTs):
+
+ interrupt-controller@c000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ interrupts-extended = <&cpu1_intc 11>,
+ <&cpu2_intc 11>,
+ <&cpu3_intc 11>,
+ <&cpu4_intc 11>;
+ reg = <0xc000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ riscv,children = <&aplic1>, <&aplic2>;
+ riscv,delegation = <&aplic1 1 63>;
+ };
+
+ aplic1: interrupt-controller@d000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ interrupts-extended = <&cpu1_intc 9>,
+ <&cpu2_intc 9>;
+ reg = <0xd000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+
+ aplic2: interrupt-controller@e000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ interrupts-extended = <&cpu3_intc 9>,
+ <&cpu4_intc 9>;
+ reg = <0xe000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+
+ - |
+ // Example 2 (APLIC domains forwarding interrupts as MSIs):
+
+ interrupt-controller@c000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ msi-parent = <&imsic_mlevel>;
+ reg = <0xc000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ riscv,children = <&aplic3>;
+ riscv,delegation = <&aplic3 1 63>;
+ };
+
+ aplic3: interrupt-controller@d000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ msi-parent = <&imsic_slevel>;
+ reg = <0xd000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+...
--
2.34.1

2023-10-23 17:31:01

by Anup Patel

Subject: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

The RISC-V advanced interrupt architecture (AIA) specification
defines a new MSI controller called the incoming message signalled
interrupt controller (IMSIC), which manages MSIs on a per-HART (or
per-CPU) basis. It also supports IPIs as software-injected MSIs.
(For more details, refer to https://github.com/riscv/riscv-aia)

Let us add an early irqchip driver for the RISC-V IMSIC which sets
up the IMSIC state and provides IPIs.
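
Conceptually (condensed from the driver below, for illustration only), an IPI
is an MSI write of the reserved IMSIC_IPI_ID into the target hart's interrupt
file, and the receiving hart drains pending identities via the TOPEI CSR. The
send side is taken almost verbatim from the driver; imsic_drain_topei() is a
made-up name condensing the receive path:

  /* Sender: write the IPI identity into the target CPU's interrupt file. */
  static void imsic_ipi_send(unsigned int cpu)
  {
          struct imsic_local_config *local =
                  per_cpu_ptr(imsic->global.local, cpu);

          writel(IMSIC_IPI_ID, local->msi_va);
  }

  /* Receiver: atomically claim the top pending identity and clear it. */
  static void imsic_drain_topei(void)
  {
          unsigned long local_id;

          while ((local_id = csr_swap(CSR_TOPEI, 0))) {
                  local_id >>= TOPEI_ID_SHIFT;
                  if (local_id == IMSIC_IPI_ID)
                          ipi_mux_process();
                  /* other identities are device MSIs handled via the IMSIC domain */
          }
  }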

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Kconfig | 6 +
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-riscv-imsic-early.c | 235 ++++++
drivers/irqchip/irq-riscv-imsic-state.c | 962 ++++++++++++++++++++++++
drivers/irqchip/irq-riscv-imsic-state.h | 109 +++
include/linux/irqchip/riscv-imsic.h | 87 +++
6 files changed, 1400 insertions(+)
create mode 100644 drivers/irqchip/irq-riscv-imsic-early.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-state.c
create mode 100644 drivers/irqchip/irq-riscv-imsic-state.h
create mode 100644 include/linux/irqchip/riscv-imsic.h

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index f7149d0f3d45..bdd80716114d 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -546,6 +546,12 @@ config SIFIVE_PLIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP

+config RISCV_IMSIC
+ bool
+ depends on RISCV
+ select IRQ_DOMAIN_HIERARCHY
+ select GENERIC_MSI_IRQ
+
config EXYNOS_IRQ_COMBINER
bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index ffd945fe71aa..d714724387ce 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
+obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
new file mode 100644
index 000000000000..23f689ff5807
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic-early.c
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-imsic: " fmt
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/module.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+
+#include "irq-riscv-imsic-state.h"
+
+static int imsic_parent_irq;
+
+#ifdef CONFIG_SMP
+static irqreturn_t imsic_local_sync_handler(int irq, void *data)
+{
+ imsic_local_sync();
+ return IRQ_HANDLED;
+}
+
+static void imsic_ipi_send(unsigned int cpu)
+{
+ struct imsic_local_config *local =
+ per_cpu_ptr(imsic->global.local, cpu);
+
+ writel(IMSIC_IPI_ID, local->msi_va);
+}
+
+static void imsic_ipi_starting_cpu(void)
+{
+ /* Enable IPIs for current CPU. */
+ __imsic_id_set_enable(IMSIC_IPI_ID);
+
+ /* Enable virtual IPI used for IMSIC ID synchronization */
+ enable_percpu_irq(imsic->ipi_virq, 0);
+}
+
+static void imsic_ipi_dying_cpu(void)
+{
+ /*
+ * Disable virtual IPI used for IMSIC ID synchronization so
+ * that we don't receive ID synchronization requests.
+ */
+ disable_percpu_irq(imsic->ipi_virq);
+}
+
+static int __init imsic_ipi_domain_init(void)
+{
+ int virq;
+
+ /* Create IMSIC IPI multiplexing */
+ virq = ipi_mux_create(IMSIC_NR_IPI, imsic_ipi_send);
+ if (virq <= 0)
+ return (virq < 0) ? virq : -ENOMEM;
+ imsic->ipi_virq = virq;
+
+ /* First vIRQ is used for IMSIC ID synchronization */
+ virq = request_percpu_irq(imsic->ipi_virq, imsic_local_sync_handler,
+ "riscv-imsic-lsync", imsic->global.local);
+ if (virq)
+ return virq;
+ irq_set_status_flags(imsic->ipi_virq, IRQ_HIDDEN);
+ imsic->ipi_lsync_desc = irq_to_desc(imsic->ipi_virq);
+
+ /* Set vIRQ range */
+ riscv_ipi_set_virq_range(imsic->ipi_virq + 1, IMSIC_NR_IPI - 1, true);
+
+ /* Announce that IMSIC is providing IPIs */
+ pr_info("%pfwP: providing IPIs using interrupt %d\n",
+ imsic->fwnode, IMSIC_IPI_ID);
+
+ return 0;
+}
+#else
+static void imsic_ipi_starting_cpu(void)
+{
+}
+
+static void imsic_ipi_dying_cpu(void)
+{
+}
+
+static int __init imsic_ipi_domain_init(void)
+{
+ return 0;
+}
+#endif
+
+/*
+ * To handle an interrupt, we read the TOPEI CSR and write zero in one
+ * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
+ * Linux interrupt number and let Linux IRQ subsystem handle it.
+ */
+static void imsic_handle_irq(struct irq_desc *desc)
+{
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ int err, cpu = smp_processor_id();
+ struct imsic_vector *vec;
+ unsigned long local_id;
+
+ chained_irq_enter(chip, desc);
+
+ while ((local_id = csr_swap(CSR_TOPEI, 0))) {
+ local_id = local_id >> TOPEI_ID_SHIFT;
+
+ if (local_id == IMSIC_IPI_ID) {
+#ifdef CONFIG_SMP
+ ipi_mux_process();
+#endif
+ continue;
+ }
+
+ if (unlikely(!imsic->base_domain))
+ continue;
+
+ vec = imsic_vector_from_local_id(cpu, local_id);
+ if (!vec) {
+ pr_warn_ratelimited(
+ "vector not found for local ID 0x%lx\n",
+ local_id);
+ continue;
+ }
+
+ err = generic_handle_domain_irq(imsic->base_domain,
+ vec->hwirq);
+ if (unlikely(err))
+ pr_warn_ratelimited(
+ "hwirq 0x%x mapping not found\n",
+ vec->hwirq);
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static int imsic_starting_cpu(unsigned int cpu)
+{
+ /* Enable per-CPU parent interrupt */
+ enable_percpu_irq(imsic_parent_irq,
+ irq_get_trigger_type(imsic_parent_irq));
+
+ /* Setup IPIs */
+ imsic_ipi_starting_cpu();
+
+ /*
+ * Interrupts identities might have been enabled/disabled while
+ * this CPU was not running so sync-up local enable/disable state.
+ */
+ imsic_local_sync();
+
+ /* Enable local interrupt delivery */
+ imsic_local_delivery(true);
+
+ return 0;
+}
+
+static int imsic_dying_cpu(unsigned int cpu)
+{
+ /* Cleanup IPIs */
+ imsic_ipi_dying_cpu();
+
+ return 0;
+}
+
+static int __init imsic_early_probe(struct fwnode_handle *fwnode)
+{
+ int rc;
+ struct irq_domain *domain;
+
+ /* Find parent domain and register chained handler */
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+ DOMAIN_BUS_ANY);
+ if (!domain) {
+ pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
+ return -ENOENT;
+ }
+ imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (!imsic_parent_irq) {
+ pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
+ return -ENOENT;
+ }
+ irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
+
+ /* Initialize IPI domain */
+ rc = imsic_ipi_domain_init();
+ if (rc) {
+ pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
+ return rc;
+ }
+
+ /*
+ * Setup cpuhp state (must be done after setting imsic_parent_irq)
+ *
+ * Don't disable per-CPU IMSIC file when CPU goes offline
+ * because this affects IPI and the masking/unmasking of
+ * virtual IPIs is done via generic IPI-Mux
+ */
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "irqchip/riscv/imsic:starting",
+ imsic_starting_cpu, imsic_dying_cpu);
+
+ return 0;
+}
+
+static int __init imsic_early_dt_init(struct device_node *node,
+ struct device_node *parent)
+{
+ int rc;
+ struct fwnode_handle *fwnode = &node->fwnode;
+
+ /* Setup IMSIC state */
+ rc = imsic_setup_state(fwnode);
+ if (rc) {
+ pr_err("%pfwP: failed to setup state (error %d)\n",
+ fwnode, rc);
+ return rc;
+ }
+
+ /* Do early setup of IPIs */
+ rc = imsic_early_probe(fwnode);
+ if (rc)
+ return rc;
+
+ /* Ensure that OF platform device gets probed */
+ of_node_clear_flag(node, OF_POPULATED);
+ return 0;
+}
+IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_early_dt_init);
diff --git a/drivers/irqchip/irq-riscv-imsic-state.c b/drivers/irqchip/irq-riscv-imsic-state.c
new file mode 100644
index 000000000000..54465e47851c
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic-state.c
@@ -0,0 +1,962 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-imsic: " fmt
+#include <linux/cpu.h>
+#include <linux/bitmap.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/seq_file.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+#include <asm/hwcap.h>
+
+#include "irq-riscv-imsic-state.h"
+
+#define IMSIC_DISABLE_EIDELIVERY 0
+#define IMSIC_ENABLE_EIDELIVERY 1
+#define IMSIC_DISABLE_EITHRESHOLD 1
+#define IMSIC_ENABLE_EITHRESHOLD 0
+
+#define imsic_csr_write(__c, __v) \
+do { \
+ csr_write(CSR_ISELECT, __c); \
+ csr_write(CSR_IREG, __v); \
+} while (0)
+
+#define imsic_csr_read(__c) \
+({ \
+ unsigned long __v; \
+ csr_write(CSR_ISELECT, __c); \
+ __v = csr_read(CSR_IREG); \
+ __v; \
+})
+
+#define imsic_csr_read_clear(__c, __v) \
+({ \
+ unsigned long __r; \
+ csr_write(CSR_ISELECT, __c); \
+ __r = csr_read_clear(CSR_IREG, __v); \
+ __r; \
+})
+
+#define imsic_csr_set(__c, __v) \
+do { \
+ csr_write(CSR_ISELECT, __c); \
+ csr_set(CSR_IREG, __v); \
+} while (0)
+
+#define imsic_csr_clear(__c, __v) \
+do { \
+ csr_write(CSR_ISELECT, __c); \
+ csr_clear(CSR_IREG, __v); \
+} while (0)
+
+struct imsic_priv *imsic;
+
+const struct imsic_global_config *imsic_get_global_config(void)
+{
+ return (imsic) ? &imsic->global : NULL;
+}
+EXPORT_SYMBOL_GPL(imsic_get_global_config);
+
+static bool __imsic_eix_read_clear(unsigned long id, bool pend)
+{
+ unsigned long isel, imask;
+
+ isel = id / BITS_PER_LONG;
+ isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
+ isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
+ imask = BIT(id & (__riscv_xlen - 1));
+
+ return (imsic_csr_read_clear(isel, imask) & imask) ? true : false;
+}
+
+#define __imsic_id_read_clear_enabled(__id) \
+ __imsic_eix_read_clear((__id), false)
+#define __imsic_id_read_clear_pending(__id) \
+ __imsic_eix_read_clear((__id), true)
+
+void __imsic_eix_update(unsigned long base_id,
+ unsigned long num_id, bool pend, bool val)
+{
+ unsigned long i, isel, ireg;
+ unsigned long id = base_id, last_id = base_id + num_id;
+
+ while (id < last_id) {
+ isel = id / BITS_PER_LONG;
+ isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
+ isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
+
+ ireg = 0;
+ for (i = id & (__riscv_xlen - 1);
+ (id < last_id) && (i < __riscv_xlen); i++) {
+ ireg |= BIT(i);
+ id++;
+ }
+
+		/*
+		 * The IMSIC EIEx and EIPx registers are indirectly
+		 * accessed via the ISELECT and IREG CSRs so we need
+		 * to access these CSRs without getting preempted.
+		 *
+		 * All existing users call this function with local
+		 * IRQs disabled so we don't need to do anything
+		 * special here.
+		 */
+ if (val)
+ imsic_csr_set(isel, ireg);
+ else
+ imsic_csr_clear(isel, ireg);
+ }
+}
+
+void imsic_local_sync(void)
+{
+ struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
+ struct imsic_local_config *mlocal;
+ struct imsic_vector *mvec;
+ unsigned long flags;
+ int i;
+
+ raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
+ for (i = 1; i <= imsic->global.nr_ids; i++) {
+ if (i == IMSIC_IPI_ID)
+ continue;
+
+ if (test_bit(i, lpriv->ids_enabled_bitmap))
+ __imsic_id_set_enable(i);
+ else
+ __imsic_id_clear_enable(i);
+
+ mvec = lpriv->ids_move[i];
+ lpriv->ids_move[i] = NULL;
+ if (mvec) {
+ if (__imsic_id_read_clear_pending(i)) {
+ mlocal = per_cpu_ptr(imsic->global.local,
+ mvec->cpu);
+ writel(mvec->local_id, mlocal->msi_va);
+ }
+
+ lpriv->vectors[i].hwirq = UINT_MAX;
+ lpriv->vectors[i].order = UINT_MAX;
+ clear_bit(i, lpriv->ids_used_bitmap);
+ }
+
+ }
+ raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
+}
+
+void imsic_local_delivery(bool enable)
+{
+ if (enable) {
+ imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
+ imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
+ } else {
+ imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
+ imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
+ }
+}
+
+#ifdef CONFIG_SMP
+static void imsic_remote_sync(unsigned int cpu)
+{
+	/*
+	 * We simply inject the ID synchronization IPI into a target CPU
+	 * if it is not the same as the current CPU. The ipi_send_mask()
+	 * implementation of the IPI mux injects the ID synchronization
+	 * IPI only for CPUs that have enabled it, so offline CPUs won't
+	 * receive it. An offline CPU will unconditionally synchronize
+	 * IDs through imsic_starting_cpu() when it is brought up.
+	 */
+ if (cpu_online(cpu)) {
+ if (cpu != smp_processor_id())
+ __ipi_send_mask(imsic->ipi_lsync_desc, cpumask_of(cpu));
+ else
+ imsic_local_sync();
+ }
+}
+#else
+static inline void imsic_remote_sync(unsigned int cpu)
+{
+ imsic_local_sync();
+}
+#endif
+
+void imsic_vector_mask(struct imsic_vector *vec)
+{
+ struct imsic_local_priv *lpriv;
+ unsigned long flags;
+
+ lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
+ if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
+ return;
+
+ raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
+ bitmap_clear(lpriv->ids_enabled_bitmap, vec->local_id, 1);
+ raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
+
+ imsic_remote_sync(vec->cpu);
+}
+
+void imsic_vector_unmask(struct imsic_vector *vec)
+{
+ struct imsic_local_priv *lpriv;
+ unsigned long flags;
+
+ lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
+ if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
+ return;
+
+ raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
+ bitmap_set(lpriv->ids_enabled_bitmap, vec->local_id, 1);
+ raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
+
+ imsic_remote_sync(vec->cpu);
+}
+
+void imsic_vector_move(struct imsic_vector *old_vec,
+ struct imsic_vector *new_vec)
+{
+ struct imsic_local_priv *old_lpriv, *new_lpriv;
+ struct imsic_vector *ovec, *nvec;
+ unsigned long flags, flags1;
+ unsigned int i;
+
+ if (WARN_ON(old_vec->cpu == new_vec->cpu ||
+ old_vec->order != new_vec->order ||
+ (old_vec->local_id & IMSIC_VECTOR_MASK(old_vec)) ||
+ (new_vec->local_id & IMSIC_VECTOR_MASK(new_vec))))
+ return;
+
+ old_lpriv = per_cpu_ptr(imsic->lpriv, old_vec->cpu);
+ if (WARN_ON(&old_lpriv->vectors[old_vec->local_id] != old_vec))
+ return;
+
+ new_lpriv = per_cpu_ptr(imsic->lpriv, new_vec->cpu);
+ if (WARN_ON(&new_lpriv->vectors[new_vec->local_id] != new_vec))
+ return;
+
+ raw_spin_lock_irqsave(&old_lpriv->ids_lock, flags);
+ raw_spin_lock_irqsave(&new_lpriv->ids_lock, flags1);
+
+ /* Move the state of each vector entry */
+ for (i = 0; i < BIT(old_vec->order); i++) {
+ ovec = old_vec + i;
+ nvec = new_vec + i;
+
+ /* Unmask the new vector entry */
+ if (test_bit(ovec->local_id, old_lpriv->ids_enabled_bitmap))
+ bitmap_set(new_lpriv->ids_enabled_bitmap,
+ nvec->local_id, 1);
+
+ /* Mask the old vector entry */
+ bitmap_clear(old_lpriv->ids_enabled_bitmap, ovec->local_id, 1);
+
+ /*
+ * Move and re-trigger the new vector entry based on the
+ * pending state of the old vector entry because we might
+ * get a device interrupt on the old vector entry while
+ * device was being moved to the new vector entry.
+ */
+ old_lpriv->ids_move[ovec->local_id] = nvec;
+ }
+
+ raw_spin_unlock_irqrestore(&new_lpriv->ids_lock, flags1);
+ raw_spin_unlock_irqrestore(&old_lpriv->ids_lock, flags);
+
+ imsic_remote_sync(old_vec->cpu);
+ imsic_remote_sync(new_vec->cpu);
+}
+
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+void imsic_vector_debug_show(struct seq_file *m,
+ struct imsic_vector *vec, int ind)
+{
+ unsigned int mcpu = 0, mlocal_id = 0;
+ struct imsic_local_priv *lpriv;
+ bool move_in_progress = false;
+ struct imsic_vector *mvec;
+ bool is_enabled = false;
+ unsigned long flags;
+
+ lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
+ if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
+ return;
+
+ raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
+ if (test_bit(vec->local_id, lpriv->ids_enabled_bitmap))
+ is_enabled = true;
+ mvec = lpriv->ids_move[vec->local_id];
+ if (mvec) {
+ move_in_progress = true;
+ mcpu = mvec->cpu;
+ mlocal_id = mvec->local_id;
+ }
+ raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
+
+ seq_printf(m, "%*starget_cpu : %5u\n", ind, "", vec->cpu);
+ seq_printf(m, "%*starget_local_id : %5u\n", ind, "", vec->local_id);
+ seq_printf(m, "%*sis_reserved : %5u\n", ind, "",
+ (vec->local_id <= IMSIC_IPI_ID) ? 1 : 0);
+ seq_printf(m, "%*sis_enabled : %5u\n", ind, "",
+		   (is_enabled) ? 1 : 0);
+ seq_printf(m, "%*sis_move_pending : %5u\n", ind, "",
+ (move_in_progress) ? 1 : 0);
+ if (move_in_progress) {
+ seq_printf(m, "%*smove_cpu : %5u\n", ind, "", mcpu);
+ seq_printf(m, "%*smove_local_id : %5u\n", ind, "", mlocal_id);
+ }
+}
+
+void imsic_vector_debug_show_summary(struct seq_file *m, int ind)
+{
+ unsigned int cpu, total_avail = 0, total_used = 0;
+ struct imsic_global_config *global = &imsic->global;
+ struct imsic_local_priv *lpriv;
+ unsigned long flags;
+
+ for_each_possible_cpu(cpu) {
+ lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+
+ total_avail += global->nr_ids;
+
+ raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
+ total_used += bitmap_weight(lpriv->ids_used_bitmap,
+ global->nr_ids + 1) - 1;
+ raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
+ }
+
+ seq_printf(m, "%*stotal : %5u\n", ind, "", total_avail);
+ seq_printf(m, "%*sused : %5u\n", ind, "", total_used);
+ seq_printf(m, "%*s| CPU | tot | usd | vectors\n", ind, " ");
+
+ cpus_read_lock();
+ for_each_online_cpu(cpu) {
+ lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+
+ raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
+ total_used = bitmap_weight(lpriv->ids_used_bitmap,
+ global->nr_ids + 1) - 1;
+ seq_printf(m, "%*s %4d %4u %4u %*pbl\n", ind, " ",
+ cpu, global->nr_ids, total_used,
+ global->nr_ids + 1, lpriv->ids_used_bitmap);
+ raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
+ }
+ cpus_read_unlock();
+}
+#endif
+
+struct imsic_vector *imsic_vector_from_local_id(unsigned int cpu,
+ unsigned int local_id)
+{
+ struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+
+ if (!lpriv || imsic->global.nr_ids < local_id)
+ return NULL;
+
+ return &lpriv->vectors[local_id];
+}
+
+static unsigned int imsic_vector_best_cpu(const struct cpumask *mask,
+ unsigned int order)
+{
+ struct imsic_global_config *global = &imsic->global;
+ unsigned int cpu, best_cpu, free, maxfree = 0;
+ struct imsic_local_priv *lpriv;
+ unsigned long flags;
+
+ best_cpu = UINT_MAX;
+ for_each_cpu(cpu, mask) {
+ if (!cpu_online(cpu))
+ continue;
+
+ lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+ raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
+ free = bitmap_weight(lpriv->ids_used_bitmap,
+ global->nr_ids + 1);
+ free = (global->nr_ids + 1) - free;
+ raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
+
+ if (free < BIT(order) || free <= maxfree)
+ continue;
+
+ best_cpu = cpu;
+ maxfree = free;
+ }
+
+ return best_cpu;
+}
+
+struct imsic_vector *imsic_vector_alloc(unsigned int hwirq,
+ const struct cpumask *mask,
+ unsigned int order)
+{
+ struct imsic_vector *vec = NULL;
+ struct imsic_local_priv *lpriv;
+ unsigned long flags;
+ unsigned int cpu;
+ int i, local_id;
+
+ if (!mask || cpumask_empty(mask))
+ return NULL;
+
+ cpu = imsic_vector_best_cpu(mask, order);
+ if (cpu == UINT_MAX)
+ return NULL;
+
+ lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+ raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
+ local_id = bitmap_find_free_region(lpriv->ids_used_bitmap,
+ imsic->global.nr_ids + 1,
+ order);
+ if (local_id > 0) {
+ for (i = 0; i < BIT(order); i++) {
+ vec = &lpriv->vectors[local_id + i];
+ vec->hwirq = hwirq + i;
+ vec->order = order;
+ }
+ vec = &lpriv->vectors[local_id];
+ }
+ raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
+
+ return vec;
+}
+
+void imsic_vector_free(struct imsic_vector *vec)
+{
+ unsigned int i, local_id, order;
+ struct imsic_local_priv *lpriv;
+ struct imsic_vector *tvec;
+ unsigned long flags;
+
+ if (WARN_ON(vec->hwirq == UINT_MAX || vec->order == UINT_MAX))
+ return;
+
+ lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
+ if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
+ return;
+
+ order = vec->order;
+ local_id = IMSIC_VECTOR_BASE_LOCAL_ID(vec);
+
+ raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
+ for (i = 0; i < BIT(order); i++) {
+ tvec = &lpriv->vectors[local_id + i];
+ tvec->hwirq = UINT_MAX;
+ tvec->order = UINT_MAX;
+ }
+ bitmap_release_region(lpriv->ids_used_bitmap, local_id, order);
+ raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
+}
+
+static void __init imsic_local_cleanup(void)
+{
+ int cpu;
+ struct imsic_local_priv *lpriv;
+
+ for_each_possible_cpu(cpu) {
+ lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+
+ bitmap_free(lpriv->ids_enabled_bitmap);
+ bitmap_free(lpriv->ids_used_bitmap);
+ kfree(lpriv->ids_move);
+ kfree(lpriv->vectors);
+ }
+
+ free_percpu(imsic->lpriv);
+}
+
+static int __init imsic_local_init(void)
+{
+ struct imsic_global_config *global = &imsic->global;
+ struct imsic_local_priv *lpriv;
+ struct imsic_vector *vec;
+ int cpu, i;
+
+ /* Allocate per-CPU private state */
+ imsic->lpriv = alloc_percpu(typeof(*(imsic->lpriv)));
+ if (!imsic->lpriv)
+ return -ENOMEM;
+
+ /* Setup per-CPU private state */
+ for_each_possible_cpu(cpu) {
+ lpriv = per_cpu_ptr(imsic->lpriv, cpu);
+
+ raw_spin_lock_init(&lpriv->ids_lock);
+
+ /* Allocate used bitmap */
+ lpriv->ids_used_bitmap = bitmap_zalloc(global->nr_ids + 1,
+ GFP_KERNEL);
+ if (!lpriv->ids_used_bitmap) {
+ imsic_local_cleanup();
+ return -ENOMEM;
+ }
+
+ /* Allocate enabled bitmap */
+ lpriv->ids_enabled_bitmap = bitmap_zalloc(global->nr_ids + 1,
+ GFP_KERNEL);
+ if (!lpriv->ids_enabled_bitmap) {
+ imsic_local_cleanup();
+ return -ENOMEM;
+ }
+
+ /* Allocate move array */
+ lpriv->ids_move = kcalloc(global->nr_ids + 1,
+ sizeof(*lpriv->ids_move), GFP_KERNEL);
+ if (!lpriv->ids_move) {
+ imsic_local_cleanup();
+ return -ENOMEM;
+ }
+
+ /* Reserve ID#0 because it is special and never implemented */
+ bitmap_set(lpriv->ids_used_bitmap, 0, 1);
+
+ /* Reserve IPI ID because it is special and used internally */
+ bitmap_set(lpriv->ids_used_bitmap, IMSIC_IPI_ID, 1);
+
+ /* Allocate vector array */
+ lpriv->vectors = kcalloc(global->nr_ids + 1,
+ sizeof(*lpriv->vectors), GFP_KERNEL);
+ if (!lpriv->vectors) {
+ imsic_local_cleanup();
+ return -ENOMEM;
+ }
+
+ /* Setup vector array */
+ for (i = 0; i <= global->nr_ids; i++) {
+ vec = &lpriv->vectors[i];
+ vec->cpu = cpu;
+ vec->local_id = i;
+ vec->hwirq = UINT_MAX;
+ vec->order = UINT_MAX;
+ }
+ }
+
+ return 0;
+}
+
+int imsic_hwirqs_alloc(unsigned int order)
+{
+ int ret;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->hwirqs_lock, flags);
+ ret = bitmap_find_free_region(imsic->hwirqs_used_bitmap,
+ imsic->nr_hwirqs, order);
+ raw_spin_unlock_irqrestore(&imsic->hwirqs_lock, flags);
+
+ return ret;
+}
+
+void imsic_hwirqs_free(unsigned int base_hwirq, unsigned int order)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->hwirqs_lock, flags);
+ bitmap_release_region(imsic->hwirqs_used_bitmap, base_hwirq, order);
+ raw_spin_unlock_irqrestore(&imsic->hwirqs_lock, flags);
+}
+
+static int __init imsic_hwirqs_init(void)
+{
+ struct imsic_global_config *global = &imsic->global;
+
+ imsic->nr_hwirqs = num_possible_cpus() * global->nr_ids;
+
+ raw_spin_lock_init(&imsic->hwirqs_lock);
+
+ imsic->hwirqs_used_bitmap = bitmap_zalloc(imsic->nr_hwirqs,
+ GFP_KERNEL);
+ if (!imsic->hwirqs_used_bitmap)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void __init imsic_hwirqs_cleanup(void)
+{
+ bitmap_free(imsic->hwirqs_used_bitmap);
+}
+
+static int __init imsic_get_parent_hartid(struct fwnode_handle *fwnode,
+ u32 index, unsigned long *hartid)
+{
+ int rc;
+ struct of_phandle_args parent;
+
+ /*
+ * Currently, only OF fwnode is supported so extend this
+ * function for ACPI support.
+ */
+ if (!is_of_node(fwnode))
+ return -EINVAL;
+
+ rc = of_irq_parse_one(to_of_node(fwnode), index, &parent);
+ if (rc)
+ return rc;
+
+ /*
+ * Skip interrupts other than external interrupts for
+ * current privilege level.
+ */
+ if (parent.args[0] != RV_IRQ_EXT)
+ return -EINVAL;
+
+ return riscv_of_parent_hartid(parent.np, hartid);
+}
+
+static int __init imsic_get_mmio_resource(struct fwnode_handle *fwnode,
+ u32 index, struct resource *res)
+{
+ /*
+ * Currently, only OF fwnode is supported so extend this
+ * function for ACPI support.
+ */
+ if (!is_of_node(fwnode))
+ return -EINVAL;
+
+ return of_address_to_resource(to_of_node(fwnode), index, res);
+}
+
+static int __init imsic_parse_fwnode(struct fwnode_handle *fwnode,
+ struct imsic_global_config *global,
+ u32 *nr_parent_irqs,
+ u32 *nr_mmios)
+{
+ unsigned long hartid;
+ struct resource res;
+ int rc;
+ u32 i;
+
+ /*
+ * Currently, only OF fwnode is supported so extend this
+ * function for ACPI support.
+ */
+ if (!is_of_node(fwnode))
+ return -EINVAL;
+
+ *nr_parent_irqs = 0;
+ *nr_mmios = 0;
+
+ /* Find number of parent interrupts */
+ *nr_parent_irqs = 0;
+ while (!imsic_get_parent_hartid(fwnode, *nr_parent_irqs, &hartid))
+ (*nr_parent_irqs)++;
+ if (!(*nr_parent_irqs)) {
+ pr_err("%pfwP: no parent irqs available\n", fwnode);
+ return -EINVAL;
+ }
+
+ /* Find number of guest index bits in MSI address */
+ rc = of_property_read_u32(to_of_node(fwnode),
+ "riscv,guest-index-bits",
+ &global->guest_index_bits);
+ if (rc)
+ global->guest_index_bits = 0;
+
+ /* Find number of HART index bits */
+ rc = of_property_read_u32(to_of_node(fwnode),
+ "riscv,hart-index-bits",
+ &global->hart_index_bits);
+ if (rc) {
+ /* Assume default value */
+ global->hart_index_bits = __fls(*nr_parent_irqs);
+ if (BIT(global->hart_index_bits) < *nr_parent_irqs)
+ global->hart_index_bits++;
+ }
+
+ /* Find number of group index bits */
+ rc = of_property_read_u32(to_of_node(fwnode),
+ "riscv,group-index-bits",
+ &global->group_index_bits);
+ if (rc)
+ global->group_index_bits = 0;
+
+	/*
+	 * Find first bit position of group index.
+	 * If not specified, assume the default APLIC-IMSIC configuration.
+	 */
+ rc = of_property_read_u32(to_of_node(fwnode),
+ "riscv,group-index-shift",
+ &global->group_index_shift);
+ if (rc)
+ global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
+
+ /* Find number of interrupt identities */
+ rc = of_property_read_u32(to_of_node(fwnode),
+ "riscv,num-ids",
+ &global->nr_ids);
+ if (rc) {
+ pr_err("%pfwP: number of interrupt identities not found\n",
+ fwnode);
+ return rc;
+ }
+
+ /* Find number of guest interrupt identities */
+ rc = of_property_read_u32(to_of_node(fwnode),
+ "riscv,num-guest-ids",
+ &global->nr_guest_ids);
+ if (rc)
+ global->nr_guest_ids = global->nr_ids;
+
+ /* Sanity check guest index bits */
+ i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
+ if (i < global->guest_index_bits) {
+ pr_err("%pfwP: guest index bits too big\n", fwnode);
+ return -EINVAL;
+ }
+
+ /* Sanity check HART index bits */
+ i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT - global->guest_index_bits;
+ if (i < global->hart_index_bits) {
+ pr_err("%pfwP: HART index bits too big\n", fwnode);
+ return -EINVAL;
+ }
+
+ /* Sanity check group index bits */
+ i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
+ global->guest_index_bits - global->hart_index_bits;
+ if (i < global->group_index_bits) {
+ pr_err("%pfwP: group index bits too big\n", fwnode);
+ return -EINVAL;
+ }
+
+ /* Sanity check group index shift */
+ i = global->group_index_bits + global->group_index_shift - 1;
+ if (i >= BITS_PER_LONG) {
+ pr_err("%pfwP: group index shift too big\n", fwnode);
+ return -EINVAL;
+ }
+
+ /* Sanity check number of interrupt identities */
+ if ((global->nr_ids < IMSIC_MIN_ID) ||
+ (global->nr_ids >= IMSIC_MAX_ID) ||
+ ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+ pr_err("%pfwP: invalid number of interrupt identities\n",
+ fwnode);
+ return -EINVAL;
+ }
+
+ /* Sanity check number of guest interrupt identities */
+ if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
+ (global->nr_guest_ids >= IMSIC_MAX_ID) ||
+ ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+ pr_err("%pfwP: invalid number of guest interrupt identities\n",
+ fwnode);
+ return -EINVAL;
+ }
+
+ /* Compute base address */
+ rc = imsic_get_mmio_resource(fwnode, 0, &res);
+ if (rc) {
+ pr_err("%pfwP: first MMIO resource not found\n", fwnode);
+ return -EINVAL;
+ }
+ global->base_addr = res.start;
+ global->base_addr &= ~(BIT(global->guest_index_bits +
+ global->hart_index_bits +
+ IMSIC_MMIO_PAGE_SHIFT) - 1);
+ global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+ global->group_index_shift);
+
+ /* Find number of MMIO register sets */
+ while (!imsic_get_mmio_resource(fwnode, *nr_mmios, &res))
+ (*nr_mmios)++;
+
+ return 0;
+}
+
+int __init imsic_setup_state(struct fwnode_handle *fwnode)
+{
+ int rc, cpu;
+ phys_addr_t base_addr;
+ void __iomem **mmios_va = NULL;
+ struct resource *mmios = NULL;
+ struct imsic_local_config *local;
+ struct imsic_global_config *global;
+ unsigned long reloff, hartid;
+ u32 i, j, index, nr_parent_irqs, nr_mmios, nr_handlers = 0;
+
+	/*
+	 * Only one IMSIC instance is allowed in a platform for a clean
+	 * implementation of SMP IRQ affinity and per-CPU IPIs.
+	 *
+	 * This means that on a multi-socket (or multi-die) platform we
+	 * will have multiple MMIO regions for one IMSIC instance.
+	 */
+ if (imsic) {
+ pr_err("%pfwP: already initialized hence ignoring\n",
+ fwnode);
+ return -EALREADY;
+ }
+
+ if (!riscv_isa_extension_available(NULL, SxAIA)) {
+ pr_err("%pfwP: AIA support not available\n", fwnode);
+ return -ENODEV;
+ }
+
+ imsic = kzalloc(sizeof(*imsic), GFP_KERNEL);
+ if (!imsic)
+ return -ENOMEM;
+ imsic->fwnode = fwnode;
+ global = &imsic->global;
+
+ global->local = alloc_percpu(typeof(*(global->local)));
+ if (!global->local) {
+ rc = -ENOMEM;
+ goto out_free_priv;
+ }
+
+ /* Parse IMSIC fwnode */
+ rc = imsic_parse_fwnode(fwnode, global, &nr_parent_irqs, &nr_mmios);
+ if (rc)
+ goto out_free_local;
+
+ /* Allocate MMIO resource array */
+ mmios = kcalloc(nr_mmios, sizeof(*mmios), GFP_KERNEL);
+ if (!mmios) {
+ rc = -ENOMEM;
+ goto out_free_local;
+ }
+
+ /* Allocate MMIO virtual address array */
+ mmios_va = kcalloc(nr_mmios, sizeof(*mmios_va), GFP_KERNEL);
+ if (!mmios_va) {
+ rc = -ENOMEM;
+ goto out_iounmap;
+ }
+
+ /* Parse and map MMIO register sets */
+ for (i = 0; i < nr_mmios; i++) {
+ rc = imsic_get_mmio_resource(fwnode, i, &mmios[i]);
+ if (rc) {
+ pr_err("%pfwP: unable to parse MMIO regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+
+ base_addr = mmios[i].start;
+ base_addr &= ~(BIT(global->guest_index_bits +
+ global->hart_index_bits +
+ IMSIC_MMIO_PAGE_SHIFT) - 1);
+ base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+ global->group_index_shift);
+ if (base_addr != global->base_addr) {
+ rc = -EINVAL;
+ pr_err("%pfwP: address mismatch for regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+
+ mmios_va[i] = ioremap(mmios[i].start, resource_size(&mmios[i]));
+ if (!mmios_va[i]) {
+ rc = -EIO;
+ pr_err("%pfwP: unable to map MMIO regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+ }
+
+ /* Initialize HW interrupt numbers */
+ rc = imsic_hwirqs_init();
+ if (rc) {
+		pr_err("%pfwP: failed to initialize HW interrupt numbers\n",
+ fwnode);
+ goto out_iounmap;
+ }
+
+	/* Initialize local (or per-CPU) state */
+ rc = imsic_local_init();
+ if (rc) {
+ pr_err("%pfwP: failed to initialize local state\n",
+ fwnode);
+ goto out_hwirqs_cleanup;
+ }
+
+ /* Configure handlers for target CPUs */
+ for (i = 0; i < nr_parent_irqs; i++) {
+ rc = imsic_get_parent_hartid(fwnode, i, &hartid);
+ if (rc) {
+ pr_warn("%pfwP: hart ID for parent irq%d not found\n",
+ fwnode, i);
+ continue;
+ }
+
+ cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0) {
+ pr_warn("%pfwP: invalid cpuid for parent irq%d\n",
+ fwnode, i);
+ continue;
+ }
+
+ /* Find MMIO location of MSI page */
+ index = nr_mmios;
+ reloff = i * BIT(global->guest_index_bits) *
+ IMSIC_MMIO_PAGE_SZ;
+		for (j = 0; j < nr_mmios; j++) {
+ if (reloff < resource_size(&mmios[j])) {
+ index = j;
+ break;
+ }
+
+ /*
+ * MMIO region size may not be aligned to
+ * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
+ * if holes are present.
+ */
+ reloff -= ALIGN(resource_size(&mmios[j]),
+ BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ);
+ }
+ if (index >= nr_mmios) {
+ pr_warn("%pfwP: MMIO not found for parent irq%d\n",
+ fwnode, i);
+ continue;
+ }
+
+ local = per_cpu_ptr(global->local, cpu);
+ local->msi_pa = mmios[index].start + reloff;
+ local->msi_va = mmios_va[index] + reloff;
+
+ nr_handlers++;
+ }
+
+	/* If no CPU handlers were found then we can't take interrupts */
+ if (!nr_handlers) {
+ pr_err("%pfwP: No CPU handlers found\n", fwnode);
+ rc = -ENODEV;
+ goto out_local_cleanup;
+ }
+
+ /* We don't need MMIO arrays anymore so let's free-up */
+ kfree(mmios_va);
+ kfree(mmios);
+
+ return 0;
+
+out_local_cleanup:
+ imsic_local_cleanup();
+out_hwirqs_cleanup:
+ imsic_hwirqs_cleanup();
+out_iounmap:
+	for (i = 0; mmios_va && i < nr_mmios; i++) {
+ if (mmios_va[i])
+ iounmap(mmios_va[i]);
+ }
+ kfree(mmios_va);
+ kfree(mmios);
+out_free_local:
+ free_percpu(imsic->global.local);
+out_free_priv:
+ kfree(imsic);
+ imsic = NULL;
+ return rc;
+}
diff --git a/drivers/irqchip/irq-riscv-imsic-state.h b/drivers/irqchip/irq-riscv-imsic-state.h
new file mode 100644
index 000000000000..82911b8b08b4
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic-state.h
@@ -0,0 +1,109 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#ifndef _IRQ_RISCV_IMSIC_STATE_H
+#define _IRQ_RISCV_IMSIC_STATE_H
+
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/fwnode.h>
+
+/*
+ * The IMSIC driver uses 1 IPI for ID synchronization and
+ * arch/riscv/kernel/smp.c requires 6 IPIs so we fix the
+ * total number of IPIs to 8.
+ */
+#define IMSIC_IPI_ID 1
+#define IMSIC_NR_IPI 8
+
+struct imsic_vector {
+ /* Fixed details of the vector */
+ unsigned int cpu;
+ unsigned int local_id;
+ /* Details saved by driver in the vector */
+ unsigned int hwirq;
+ unsigned int order;
+};
+
+#define IMSIC_VECTOR_MASK(__v) \
+ (BIT((__v)->order) - 1UL)
+#define IMSIC_VECTOR_BASE_LOCAL_ID(__v) \
+ ((__v)->local_id & ~IMSIC_VECTOR_MASK(__v))
+#define IMSIC_VECTOR_BASE_HWIRQ(__v) \
+ ((__v)->hwirq & ~IMSIC_VECTOR_MASK(__v))
+
+struct imsic_local_priv {
+ /* Local state of interrupt identities */
+ raw_spinlock_t ids_lock;
+ unsigned long *ids_used_bitmap;
+ unsigned long *ids_enabled_bitmap;
+ struct imsic_vector **ids_move;
+
+ /* Local vector table */
+ struct imsic_vector *vectors;
+};
+
+struct imsic_priv {
+ /* Device details */
+ struct fwnode_handle *fwnode;
+
+ /* Global configuration common for all HARTs */
+ struct imsic_global_config global;
+
+ /* Dummy HW interrupt numbers */
+ unsigned int nr_hwirqs;
+ raw_spinlock_t hwirqs_lock;
+ unsigned long *hwirqs_used_bitmap;
+
+ /* Per-CPU state */
+ struct imsic_local_priv __percpu *lpriv;
+
+ /* IPI interrupt identity and synchronization */
+ int ipi_virq;
+ struct irq_desc *ipi_lsync_desc;
+
+ /* IRQ domains (created by platform driver) */
+ struct irq_domain *base_domain;
+ struct irq_domain *plat_domain;
+};
+
+extern struct imsic_priv *imsic;
+
+void __imsic_eix_update(unsigned long base_id,
+ unsigned long num_id, bool pend, bool val);
+
+#define __imsic_id_set_enable(__id) \
+ __imsic_eix_update((__id), 1, false, true)
+#define __imsic_id_clear_enable(__id) \
+ __imsic_eix_update((__id), 1, false, false)
+
+void imsic_local_sync(void);
+void imsic_local_delivery(bool enable);
+
+void imsic_vector_mask(struct imsic_vector *vec);
+void imsic_vector_unmask(struct imsic_vector *vec);
+void imsic_vector_move(struct imsic_vector *old_vec,
+ struct imsic_vector *new_vec);
+
+struct imsic_vector *imsic_vector_from_local_id(unsigned int cpu,
+ unsigned int local_id);
+
+struct imsic_vector *imsic_vector_alloc(unsigned int hwirq,
+ const struct cpumask *mask,
+ unsigned int order);
+void imsic_vector_free(struct imsic_vector *vector);
+
+void imsic_vector_debug_show(struct seq_file *m,
+ struct imsic_vector *vec, int ind);
+
+void imsic_vector_debug_show_summary(struct seq_file *m, int ind);
+
+int imsic_hwirqs_alloc(unsigned int order);
+void imsic_hwirqs_free(unsigned int base_hwirq, unsigned int order);
+
+int imsic_setup_state(struct fwnode_handle *fwnode);
+
+#endif
diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
new file mode 100644
index 000000000000..cbb7bcd0e4dd
--- /dev/null
+++ b/include/linux/irqchip/riscv-imsic.h
@@ -0,0 +1,87 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_IMSIC_H
+#define __LINUX_IRQCHIP_RISCV_IMSIC_H
+
+#include <linux/types.h>
+#include <linux/bitops.h>
+#include <asm/csr.h>
+
+#define IMSIC_MMIO_PAGE_SHIFT 12
+#define IMSIC_MMIO_PAGE_SZ BIT(IMSIC_MMIO_PAGE_SHIFT)
+#define IMSIC_MMIO_PAGE_LE 0x00
+#define IMSIC_MMIO_PAGE_BE 0x04
+
+#define IMSIC_MIN_ID 63
+#define IMSIC_MAX_ID 2048
+
+#define IMSIC_EIDELIVERY 0x70
+
+#define IMSIC_EITHRESHOLD 0x72
+
+#define IMSIC_EIP0 0x80
+#define IMSIC_EIP63 0xbf
+#define IMSIC_EIPx_BITS 32
+
+#define IMSIC_EIE0 0xc0
+#define IMSIC_EIE63 0xff
+#define IMSIC_EIEx_BITS 32
+
+#define IMSIC_FIRST IMSIC_EIDELIVERY
+#define IMSIC_LAST IMSIC_EIE63
+
+#define IMSIC_MMIO_SETIPNUM_LE 0x00
+#define IMSIC_MMIO_SETIPNUM_BE 0x04
+
+struct imsic_local_config {
+ phys_addr_t msi_pa;
+ void __iomem *msi_va;
+};
+
+struct imsic_global_config {
+ /*
+ * MSI Target Address Scheme
+ *
+ * XLEN-1 12 0
+ * | | |
+ * -------------------------------------------------------------
+ * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index| 0 |
+ * -------------------------------------------------------------
+ */
+
+ /* Bits representing Guest index, HART index, and Group index */
+ u32 guest_index_bits;
+ u32 hart_index_bits;
+ u32 group_index_bits;
+ u32 group_index_shift;
+
+ /* Global base address matching all target MSI addresses */
+ phys_addr_t base_addr;
+
+ /* Number of interrupt identities */
+ u32 nr_ids;
+
+ /* Number of guest interrupt identities */
+ u32 nr_guest_ids;
+
+ /* Per-CPU IMSIC addresses */
+ struct imsic_local_config __percpu *local;
+};
+
+#ifdef CONFIG_RISCV_IMSIC
+
+extern const struct imsic_global_config *imsic_get_global_config(void);
+
+#else
+
+static inline const struct imsic_global_config *imsic_get_global_config(void)
+{
+ return NULL;
+}
+
+#endif
+
+#endif
--
2.34.1

2023-10-23 17:31:07

by Anup Patel

[permalink] [raw]
Subject: [PATCH v11 14/14] MAINTAINERS: Add entry for RISC-V AIA drivers

Add myself as maintainer for the RISC-V AIA drivers, including the
RISC-V INTC driver which supports both AIA and non-AIA platforms.

Signed-off-by: Anup Patel <[email protected]>
---
MAINTAINERS | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 801a2f44182c..4557675c6086 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18410,6 +18410,20 @@ S: Maintained
F: drivers/mtd/nand/raw/r852.c
F: drivers/mtd/nand/raw/r852.h

+RISC-V AIA DRIVERS
+M: Anup Patel <[email protected]>
+L: [email protected]
+S: Maintained
+F: Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
+F: Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
+F: drivers/irqchip/irq-riscv-aplic-*.c
+F: drivers/irqchip/irq-riscv-aplic-*.h
+F: drivers/irqchip/irq-riscv-imsic-*.c
+F: drivers/irqchip/irq-riscv-imsic-*.h
+F: drivers/irqchip/irq-riscv-intc.c
+F: include/linux/irqchip/riscv-aplic.h
+F: include/linux/irqchip/riscv-imsic.h
+
RISC-V ARCHITECTURE
M: Paul Walmsley <[email protected]>
M: Palmer Dabbelt <[email protected]>
--
2.34.1

2023-10-23 17:31:22

by Anup Patel

[permalink] [raw]
Subject: [PATCH v11 12/14] irqchip/riscv-aplic: Add support for MSI-mode

The RISC-V advanced platform-level interrupt controller (APLIC) has
two modes of operation: 1) Direct mode and 2) MSI mode.
(For more details, refer to https://github.com/riscv/riscv-aia)

In APLIC MSI-mode, wired interrupts are forwarded as message signaled
interrupts (MSIs) to CPUs via IMSIC.

We extend the existing APLIC irqchip driver to support MSI-mode for
RISC-V platforms having both wired interrupts and MSIs.
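
From the point of view of a driver for a device wired to the APLIC, the
mode of operation is transparent. As a rough, hypothetical sketch (not
part of this patch; all names are illustrative), such a consumer keeps
using the regular platform IRQ API while the APLIC converts the wired
interrupt into an MSI targeting an IMSIC file:

#include <linux/interrupt.h>
#include <linux/platform_device.h>

static irqreturn_t example_wired_handler(int irq, void *data)
{
	/* Handle the device interrupt as usual */
	return IRQ_HANDLED;
}

static int example_wired_probe(struct platform_device *pdev)
{
	int irq = platform_get_irq(pdev, 0);

	if (irq < 0)
		return irq;

	/*
	 * In APLIC MSI-mode the wired interrupt is forwarded as an MSI
	 * behind the scenes; nothing changes for the consumer driver.
	 */
	return devm_request_irq(&pdev->dev, irq, example_wired_handler, 0,
				dev_name(&pdev->dev), pdev);
}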

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Kconfig | 6 +
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-riscv-aplic-main.c | 2 +-
drivers/irqchip/irq-riscv-aplic-main.h | 8 +
drivers/irqchip/irq-riscv-aplic-msi.c | 285 +++++++++++++++++++++++++
5 files changed, 301 insertions(+), 1 deletion(-)
create mode 100644 drivers/irqchip/irq-riscv-aplic-msi.c

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 1996cc6f666a..7adc4dbe07ff 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -551,6 +551,12 @@ config RISCV_APLIC
depends on RISCV
select IRQ_DOMAIN_HIERARCHY

+config RISCV_APLIC_MSI
+ bool
+ depends on RISCV_APLIC
+ select GENERIC_MSI_IRQ
+ default RISCV_APLIC
+
config RISCV_IMSIC
bool
depends on RISCV
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 7f8289790ed8..47995fdb2c60 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic-main.o irq-riscv-aplic-direct.o
+obj-$(CONFIG_RISCV_APLIC_MSI) += irq-riscv-aplic-msi.o
obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o irq-riscv-imsic-platform.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
diff --git a/drivers/irqchip/irq-riscv-aplic-main.c b/drivers/irqchip/irq-riscv-aplic-main.c
index 87450708a733..d1b342b66551 100644
--- a/drivers/irqchip/irq-riscv-aplic-main.c
+++ b/drivers/irqchip/irq-riscv-aplic-main.c
@@ -205,7 +205,7 @@ static int aplic_probe(struct platform_device *pdev)
msi_mode = of_property_present(to_of_node(dev->fwnode),
"msi-parent");
if (msi_mode)
- rc = -ENODEV;
+ rc = aplic_msi_setup(dev, regs);
else
rc = aplic_direct_setup(dev, regs);
if (rc) {
diff --git a/drivers/irqchip/irq-riscv-aplic-main.h b/drivers/irqchip/irq-riscv-aplic-main.h
index 474a04229334..78267ec58098 100644
--- a/drivers/irqchip/irq-riscv-aplic-main.h
+++ b/drivers/irqchip/irq-riscv-aplic-main.h
@@ -41,5 +41,13 @@ void aplic_init_hw_global(struct aplic_priv *priv, bool msi_mode);
int aplic_setup_priv(struct aplic_priv *priv, struct device *dev,
void __iomem *regs);
int aplic_direct_setup(struct device *dev, void __iomem *regs);
+#ifdef CONFIG_RISCV_APLIC_MSI
+int aplic_msi_setup(struct device *dev, void __iomem *regs);
+#else
+static inline int aplic_msi_setup(struct device *dev, void __iomem *regs)
+{
+ return -ENODEV;
+}
+#endif

#endif
diff --git a/drivers/irqchip/irq-riscv-aplic-msi.c b/drivers/irqchip/irq-riscv-aplic-msi.c
new file mode 100644
index 000000000000..086d00e0429e
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic-msi.c
@@ -0,0 +1,285 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/riscv-aplic.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+#include <linux/printk.h>
+#include <linux/smp.h>
+
+#include "irq-riscv-aplic-main.h"
+
+static void aplic_msi_irq_unmask(struct irq_data *d)
+{
+ aplic_irq_unmask(d);
+ irq_chip_unmask_parent(d);
+}
+
+static void aplic_msi_irq_mask(struct irq_data *d)
+{
+ aplic_irq_mask(d);
+ irq_chip_mask_parent(d);
+}
+
+static void aplic_msi_irq_eoi(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ u32 reg_off, reg_mask;
+
+	/*
+	 * EOI handling is required only for level-triggered
+	 * interrupts in APLIC MSI mode.
+	 */
+
+ reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
+ reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
+ switch (irqd_get_trigger_type(d)) {
+ case IRQ_TYPE_LEVEL_LOW:
+ if (!(readl(priv->regs + reg_off) & reg_mask))
+ writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
+ break;
+ case IRQ_TYPE_LEVEL_HIGH:
+ if (readl(priv->regs + reg_off) & reg_mask)
+ writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
+ break;
+ }
+}
+
+static struct irq_chip aplic_msi_chip = {
+ .name = "APLIC-MSI",
+ .irq_mask = aplic_msi_irq_mask,
+ .irq_unmask = aplic_msi_irq_unmask,
+ .irq_set_type = aplic_irq_set_type,
+ .irq_eoi = aplic_msi_irq_eoi,
+#ifdef CONFIG_SMP
+ .irq_set_affinity = irq_chip_set_affinity_parent,
+#endif
+ .flags = IRQCHIP_SET_TYPE_MASKED |
+ IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int aplic_msi_irqdomain_translate(struct irq_domain *d,
+ struct irq_fwspec *fwspec,
+ unsigned long *hwirq,
+ unsigned int *type)
+{
+ struct aplic_priv *priv = platform_msi_get_host_data(d);
+
+ return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
+}
+
+static int aplic_msi_irqdomain_alloc(struct irq_domain *domain,
+ unsigned int virq, unsigned int nr_irqs,
+ void *arg)
+{
+ int i, ret;
+ unsigned int type;
+ irq_hw_number_t hwirq;
+ struct irq_fwspec *fwspec = arg;
+ struct aplic_priv *priv = platform_msi_get_host_data(domain);
+
+ ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
+ if (ret)
+ return ret;
+
+ ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++) {
+ irq_domain_set_info(domain, virq + i, hwirq + i,
+ &aplic_msi_chip, priv, handle_fasteoi_irq,
+ NULL, NULL);
+ /*
+		 * APLIC does not implement irq_disable() so the Linux interrupt
+ * subsystem will take a lazy approach for disabling an APLIC
+ * interrupt. This means APLIC interrupts are left unmasked
+ * upon system suspend and interrupts are not processed
+ * immediately upon system wake up. To tackle this, we disable
+ * the lazy approach for all APLIC interrupts.
+ */
+ irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
+ }
+
+ return 0;
+}
+
+static const struct irq_domain_ops aplic_msi_irqdomain_ops = {
+ .translate = aplic_msi_irqdomain_translate,
+ .alloc = aplic_msi_irqdomain_alloc,
+ .free = platform_msi_device_domain_free,
+};
+
+static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
+{
+ unsigned int group_index, hart_index, guest_index, val;
+ struct irq_data *d = irq_get_irq_data(desc->irq);
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ struct aplic_msicfg *mc = &priv->msicfg;
+ phys_addr_t tppn, tbppn, msg_addr;
+ void __iomem *target;
+
+ /* For zeroed MSI, simply write zero into the target register */
+ if (!msg->address_hi && !msg->address_lo && !msg->data) {
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ writel(0, target);
+ return;
+ }
+
+ /* Sanity check on message data */
+ WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
+
+ /* Compute target MSI address */
+ msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
+ tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+
+ /* Compute target HART Base PPN */
+ tbppn = tppn;
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+ WARN_ON(tbppn != mc->base_ppn);
+
+ /* Compute target group and hart indexes */
+ group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
+ APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
+ hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
+ APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
+ hart_index |= (group_index << mc->lhxw);
+ WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
+
+ /* Compute target guest index */
+ guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
+
+ /* Update IRQ TARGET register */
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
+ << APLIC_TARGET_HART_IDX_SHIFT;
+ val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
+ << APLIC_TARGET_GUEST_IDX_SHIFT;
+ val |= (msg->data & APLIC_TARGET_EIID_MASK);
+ writel(val, target);
+}
+
+int aplic_msi_setup(struct device *dev, void __iomem *regs)
+{
+ const struct imsic_global_config *imsic_global;
+ struct irq_domain *irqdomain;
+ struct aplic_priv *priv;
+ struct aplic_msicfg *mc;
+ phys_addr_t pa;
+ int rc;
+
+ priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ rc = aplic_setup_priv(priv, dev, regs);
+	if (rc) {
+ dev_err(dev, "failed to create APLIC context\n");
+ return rc;
+ }
+ mc = &priv->msicfg;
+
+ /*
+ * The APLIC outgoing MSI config registers assume target MSI
+ * controller to be RISC-V AIA IMSIC controller.
+ */
+ imsic_global = imsic_get_global_config();
+ if (!imsic_global) {
+ dev_err(dev, "IMSIC global config not found\n");
+ return -ENODEV;
+ }
+
+ /* Find number of guest index bits (LHXS) */
+ mc->lhxs = imsic_global->guest_index_bits;
+ if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
+		dev_err(dev, "IMSIC guest index bits too big for APLIC LHXS\n");
+ return -EINVAL;
+ }
+
+ /* Find number of HART index bits (LHXW) */
+ mc->lhxw = imsic_global->hart_index_bits;
+ if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
+		dev_err(dev, "IMSIC hart index bits too big for APLIC LHXW\n");
+ return -EINVAL;
+ }
+
+ /* Find number of group index bits (HHXW) */
+ mc->hhxw = imsic_global->group_index_bits;
+ if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
+		dev_err(dev, "IMSIC group index bits too big for APLIC HHXW\n");
+ return -EINVAL;
+ }
+
+ /* Find first bit position of group index (HHXS) */
+ mc->hhxs = imsic_global->group_index_shift;
+ if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
+ dev_err(dev, "IMSIC group index shift should be >= %d\n",
+ (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
+ return -EINVAL;
+ }
+ mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
+ if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
+		dev_err(dev, "IMSIC group index shift too big for APLIC HHXS\n");
+ return -EINVAL;
+ }
+
+ /* Compute PPN base */
+ mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+
+ /* Setup global config and interrupt delivery */
+ aplic_init_hw_global(priv, true);
+
+ /* Set the APLIC device MSI domain if not available */
+ if (!dev_get_msi_domain(dev)) {
+ /*
+		 * The device MSI domain for OF devices is only set at the
+		 * time of populating/creating the OF device. If the device
+		 * MSI domain is discovered after the OF device is created
+		 * then we need to set it explicitly before using any
+		 * platform MSI functions.
+		 *
+		 * In the case of the APLIC device, the parent MSI domain is
+		 * always IMSIC and the IMSIC MSI domains are created later
+		 * through platform driver probing, so we set it explicitly here.
+ */
+ if (is_of_node(dev->fwnode))
+ of_msi_configure(dev, to_of_node(dev->fwnode));
+ }
+
+ /* Create irq domain instance for the APLIC MSI-mode */
+ irqdomain = platform_msi_create_device_domain(
+ dev, priv->nr_irqs + 1,
+ aplic_msi_write_msg,
+ &aplic_msi_irqdomain_ops,
+ priv);
+ if (!irqdomain) {
+ dev_err(dev, "failed to create MSI irq domain\n");
+ return -ENOMEM;
+ }
+
+ /* Advertise the interrupt controller */
+ pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
+	dev_info(dev, "%d interrupts forwarded to MSI base %pa\n",
+ priv->nr_irqs, &pa);
+
+ return 0;
+}
--
2.34.1

2023-10-23 17:31:25

by Anup Patel

[permalink] [raw]
Subject: [PATCH v11 13/14] RISC-V: Select APLIC and IMSIC drivers

The QEMU virt machine supports AIA emulation and we also have
quite a few RISC-V platforms with AIA support under development,
so let us select the APLIC and IMSIC drivers for all RISC-V platforms.

Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
---
arch/riscv/Kconfig | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d607ab0f7c6d..45c660f1219d 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -153,6 +153,8 @@ config RISCV
select PCI_DOMAINS_GENERIC if PCI
select PCI_MSI if PCI
select RISCV_ALTERNATIVE if !XIP_KERNEL
+ select RISCV_APLIC
+ select RISCV_IMSIC
select RISCV_INTC
select RISCV_TIMER if RISCV_SBI
select SIFIVE_PLIC
--
2.34.1

2023-10-23 17:31:30

by Anup Patel

[permalink] [raw]
Subject: [PATCH v11 04/14] irqchip/sifive-plic: Convert PLIC driver into a platform driver

The PLIC driver does not require very early initialization so let
us convert it into a platform driver.

As part of the conversion, the PLIC probing undergoes the following
changes:
1. Use dev_info(), dev_err() and dev_warn() instead of pr_info(),
pr_err() and pr_warn()
2. Use devm_xyz() APIs wherever applicable
3. PLIC is now probed after CPUs are brought up so we have to
set up the cpuhp state after the context handlers of all online
CPUs are initialized, otherwise we see a crash on multi-socket systems

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/irq-sifive-plic.c | 239 ++++++++++++++++++------------
1 file changed, 148 insertions(+), 91 deletions(-)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 5b7bc4fd9517..c8f8a8cdcce1 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -3,7 +3,6 @@
* Copyright (C) 2017 SiFive
* Copyright (C) 2018 Christoph Hellwig
*/
-#define pr_fmt(fmt) "plic: " fmt
#include <linux/cpu.h>
#include <linux/interrupt.h>
#include <linux/io.h>
@@ -64,6 +63,7 @@
#define PLIC_QUIRK_EDGE_INTERRUPT 0

struct plic_priv {
+ struct device *dev;
struct cpumask lmask;
struct irq_domain *irqdomain;
void __iomem *regs;
@@ -85,7 +85,6 @@ struct plic_handler {
struct plic_priv *priv;
};
static int plic_parent_irq __ro_after_init;
-static bool plic_cpuhp_setup_done __ro_after_init;
static DEFINE_PER_CPU(struct plic_handler, plic_handlers);

static int plic_irq_set_type(struct irq_data *d, unsigned int type);
@@ -371,7 +370,8 @@ static void plic_handle_irq(struct irq_desc *desc)
int err = generic_handle_domain_irq(handler->priv->irqdomain,
hwirq);
if (unlikely(err))
- pr_warn_ratelimited("can't find mapping for hwirq %lu\n",
+ dev_warn_ratelimited(handler->priv->dev,
+ "can't find mapping for hwirq %lu\n",
hwirq);
}

@@ -406,57 +406,126 @@ static int plic_starting_cpu(unsigned int cpu)
return 0;
}

-static int __init __plic_init(struct device_node *node,
- struct device_node *parent,
- unsigned long plic_quirks)
+static const struct of_device_id plic_match[] = {
+ { .compatible = "sifive,plic-1.0.0" },
+ { .compatible = "riscv,plic0" },
+ { .compatible = "andestech,nceplic100",
+ .data = (const void *)BIT(PLIC_QUIRK_EDGE_INTERRUPT) },
+ { .compatible = "thead,c900-plic",
+ .data = (const void *)BIT(PLIC_QUIRK_EDGE_INTERRUPT) },
+ {}
+};
+
+static int plic_parse_nr_irqs_and_contexts(struct platform_device *pdev,
+ u32 *nr_irqs, u32 *nr_contexts)
{
- int error = 0, nr_contexts, nr_handlers = 0, i;
- u32 nr_irqs;
- struct plic_priv *priv;
+ struct device *dev = &pdev->dev;
+ int rc;
+
+ /*
+ * Currently, only OF fwnode is supported so extend this
+ * function for ACPI support.
+ */
+ if (!is_of_node(dev->fwnode))
+ return -EINVAL;
+
+ rc = of_property_read_u32(to_of_node(dev->fwnode),
+ "riscv,ndev", nr_irqs);
+ if (rc) {
+ dev_err(dev, "riscv,ndev property not available\n");
+ return rc;
+ }
+
+ *nr_contexts = of_irq_count(to_of_node(dev->fwnode));
+ if (WARN_ON(!(*nr_contexts))) {
+ dev_err(dev, "no PLIC context available\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int plic_parse_context_parent_hwirq(struct platform_device *pdev,
+ u32 context, u32 *parent_hwirq,
+ unsigned long *parent_hartid)
+{
+ struct device *dev = &pdev->dev;
+ struct of_phandle_args parent;
+ int rc;
+
+ /*
+ * Currently, only OF fwnode is supported so extend this
+ * function for ACPI support.
+ */
+ if (!is_of_node(dev->fwnode))
+ return -EINVAL;
+
+ rc = of_irq_parse_one(to_of_node(dev->fwnode), context, &parent);
+ if (rc)
+ return rc;
+
+ rc = riscv_of_parent_hartid(parent.np, parent_hartid);
+ if (rc)
+ return rc;
+
+ *parent_hwirq = parent.args[0];
+ return 0;
+}
+
+static int plic_probe(struct platform_device *pdev)
+{
+ int rc, nr_contexts, nr_handlers = 0, i, cpu;
+ unsigned long plic_quirks = 0, hartid;
+ struct device *dev = &pdev->dev;
struct plic_handler *handler;
- unsigned int cpu;
+ u32 nr_irqs, parent_hwirq;
+ struct irq_domain *domain;
+ struct plic_priv *priv;
+ irq_hw_number_t hwirq;
+ struct resource *res;
+ bool cpuhp_setup;
+
+ if (is_of_node(dev->fwnode)) {
+ const struct of_device_id *id;
+
+ id = of_match_node(plic_match, to_of_node(dev->fwnode));
+ if (id)
+ plic_quirks = (unsigned long)id->data;
+ }

- priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
-
+ priv->dev = dev;
priv->plic_quirks = plic_quirks;

- priv->regs = of_iomap(node, 0);
- if (WARN_ON(!priv->regs)) {
- error = -EIO;
- goto out_free_priv;
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res) {
+ dev_err(dev, "failed to get MMIO resource\n");
+ return -EINVAL;
+ }
+ priv->regs = devm_ioremap(dev, res->start, resource_size(res));
+ if (!priv->regs) {
+ dev_err(dev, "failed map MMIO registers\n");
+ return -EIO;
}

- error = -EINVAL;
- of_property_read_u32(node, "riscv,ndev", &nr_irqs);
- if (WARN_ON(!nr_irqs))
- goto out_iounmap;
-
+ rc = plic_parse_nr_irqs_and_contexts(pdev, &nr_irqs, &nr_contexts);
+ if (rc) {
+ dev_err(dev, "failed to parse irqs and contexts\n");
+ return rc;
+ }
priv->nr_irqs = nr_irqs;

- priv->prio_save = bitmap_alloc(nr_irqs, GFP_KERNEL);
+ priv->prio_save = devm_bitmap_zalloc(dev, nr_irqs, GFP_KERNEL);
if (!priv->prio_save)
- goto out_free_priority_reg;
-
- nr_contexts = of_irq_count(node);
- if (WARN_ON(!nr_contexts))
- goto out_free_priority_reg;
-
- error = -ENOMEM;
- priv->irqdomain = irq_domain_add_linear(node, nr_irqs + 1,
- &plic_irqdomain_ops, priv);
- if (WARN_ON(!priv->irqdomain))
- goto out_free_priority_reg;
+ return -ENOMEM;

for (i = 0; i < nr_contexts; i++) {
- struct of_phandle_args parent;
- irq_hw_number_t hwirq;
- int cpu;
- unsigned long hartid;
-
- if (of_irq_parse_one(node, i, &parent)) {
- pr_err("failed to parse parent for context %d.\n", i);
+ rc = plic_parse_context_parent_hwirq(pdev, i,
+ &parent_hwirq, &hartid);
+ if (rc) {
+ dev_warn(dev, "hwirq for context%d not found\n", i);
continue;
}

@@ -464,7 +533,7 @@ static int __init __plic_init(struct device_node *node,
* Skip contexts other than external interrupts for our
* privilege level.
*/
- if (parent.args[0] != RV_IRQ_EXT) {
+ if (parent_hwirq != RV_IRQ_EXT) {
/* Disable S-mode enable bits if running in M-mode. */
if (IS_ENABLED(CONFIG_RISCV_M_MODE)) {
void __iomem *enable_base = priv->regs +
@@ -477,21 +546,17 @@ static int __init __plic_init(struct device_node *node,
continue;
}

- error = riscv_of_parent_hartid(parent.np, &hartid);
- if (error < 0) {
- pr_warn("failed to parse hart ID for context %d.\n", i);
- continue;
- }
-
cpu = riscv_hartid_to_cpuid(hartid);
if (cpu < 0) {
- pr_warn("Invalid cpuid for context %d\n", i);
+ dev_warn(dev, "Invalid cpuid for context %d\n", i);
continue;
}

/* Find parent domain and register chained handler */
- if (!plic_parent_irq && irq_find_host(parent.np)) {
- plic_parent_irq = irq_of_parse_and_map(node, i);
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+ DOMAIN_BUS_ANY);
+ if (!plic_parent_irq && domain) {
+ plic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
if (plic_parent_irq)
irq_set_chained_handler(plic_parent_irq,
plic_handle_irq);
@@ -504,7 +569,7 @@ static int __init __plic_init(struct device_node *node,
*/
handler = per_cpu_ptr(&plic_handlers, cpu);
if (handler->present) {
- pr_warn("handler already present for context %d.\n", i);
+ dev_warn(dev, "handler already present for context%d.\n", i);
plic_set_threshold(handler, PLIC_DISABLE_THRESHOLD);
goto done;
}
@@ -518,10 +583,13 @@ static int __init __plic_init(struct device_node *node,
i * CONTEXT_ENABLE_SIZE;
handler->priv = priv;

- handler->enable_save = kcalloc(DIV_ROUND_UP(nr_irqs, 32),
- sizeof(*handler->enable_save), GFP_KERNEL);
+ handler->enable_save = devm_kcalloc(dev,
+ DIV_ROUND_UP(nr_irqs, 32),
+ sizeof(*handler->enable_save),
+ GFP_KERNEL);
if (!handler->enable_save)
- goto out_free_enable_reg;
+ return -ENOMEM;
+
done:
for (hwirq = 1; hwirq <= nr_irqs; hwirq++) {
plic_toggle(handler, hwirq, 0);
@@ -531,52 +599,41 @@ static int __init __plic_init(struct device_node *node,
nr_handlers++;
}

+ priv->irqdomain = irq_domain_create_linear(dev->fwnode, nr_irqs + 1,
+ &plic_irqdomain_ops, priv);
+ if (WARN_ON(!priv->irqdomain))
+ return -ENOMEM;
+
/*
* We can have multiple PLIC instances so setup cpuhp state
- * and register syscore operations only when context handler
- * for current/boot CPU is present.
+ * and register syscore operations only after context handlers
+ * of all online CPUs are initialized.
*/
- handler = this_cpu_ptr(&plic_handlers);
- if (handler->present && !plic_cpuhp_setup_done) {
+ cpuhp_setup = true;
+ for_each_online_cpu(cpu) {
+ handler = per_cpu_ptr(&plic_handlers, cpu);
+ if (!handler->present) {
+ cpuhp_setup = false;
+ break;
+ }
+ }
+ if (cpuhp_setup) {
cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
"irqchip/sifive/plic:starting",
plic_starting_cpu, plic_dying_cpu);
register_syscore_ops(&plic_irq_syscore_ops);
- plic_cpuhp_setup_done = true;
}

- pr_info("%pOFP: mapped %d interrupts with %d handlers for"
- " %d contexts.\n", node, nr_irqs, nr_handlers, nr_contexts);
+ dev_info(dev, "mapped %d interrupts with %d handlers for"
+ " %d contexts.\n", nr_irqs, nr_handlers, nr_contexts);
return 0;
-
-out_free_enable_reg:
- for_each_cpu(cpu, cpu_present_mask) {
- handler = per_cpu_ptr(&plic_handlers, cpu);
- kfree(handler->enable_save);
- }
-out_free_priority_reg:
- kfree(priv->prio_save);
-out_iounmap:
- iounmap(priv->regs);
-out_free_priv:
- kfree(priv);
- return error;
}

-static int __init plic_init(struct device_node *node,
- struct device_node *parent)
-{
- return __plic_init(node, parent, 0);
-}
-
-IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init);
-IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
-
-static int __init plic_edge_init(struct device_node *node,
- struct device_node *parent)
-{
- return __plic_init(node, parent, BIT(PLIC_QUIRK_EDGE_INTERRUPT));
-}
-
-IRQCHIP_DECLARE(andestech_nceplic100, "andestech,nceplic100", plic_edge_init);
-IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", plic_edge_init);
+static struct platform_driver plic_driver = {
+ .driver = {
+ .name = "riscv-plic",
+ .of_match_table = plic_match,
+ },
+ .probe = plic_probe,
+};
+builtin_platform_driver(plic_driver);
--
2.34.1

2023-10-23 17:31:36

by Anup Patel

[permalink] [raw]
Subject: [PATCH v11 09/14] irqchip/riscv-imsic: Add support for PCI MSI irqdomain

The Linux PCI framework requires its own dedicated MSI irqdomain so
let us create the PCI MSI irqdomain as a child of the IMSIC base irqdomain.
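
As a rough illustration (not part of this patch; the driver and vector
count are hypothetical), a PCI device driver then allocates its vectors
through the usual PCI MSI API and the resulting MSIs are delivered via
the per-CPU IMSIC files:

#include <linux/pci.h>

static int example_pci_probe(struct pci_dev *pdev,
			     const struct pci_device_id *id)
{
	int nvec;

	/* Vectors are backed by the IMSIC PCI MSI irqdomain */
	nvec = pci_alloc_irq_vectors(pdev, 1, 4, PCI_IRQ_MSIX | PCI_IRQ_MSI);
	if (nvec < 0)
		return nvec;

	/*
	 * pci_irq_vector(pdev, i) now returns Linux IRQ numbers which can
	 * be passed to request_irq() just like on any other architecture.
	 */
	return 0;
}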

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Kconfig | 7 +++
drivers/irqchip/irq-riscv-imsic-platform.c | 51 ++++++++++++++++++++++
drivers/irqchip/irq-riscv-imsic-state.h | 1 +
3 files changed, 59 insertions(+)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index bdd80716114d..c1d69b418dfb 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -552,6 +552,13 @@ config RISCV_IMSIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_MSI_IRQ

+config RISCV_IMSIC_PCI
+ bool
+ depends on RISCV_IMSIC
+ depends on PCI
+ depends on PCI_MSI
+ default RISCV_IMSIC
+
config EXYNOS_IRQ_COMBINER
bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c
index 23d286cb017e..cdb659401199 100644
--- a/drivers/irqchip/irq-riscv-imsic-platform.c
+++ b/drivers/irqchip/irq-riscv-imsic-platform.c
@@ -13,6 +13,7 @@
#include <linux/irqdomain.h>
#include <linux/module.h>
#include <linux/msi.h>
+#include <linux/pci.h>
#include <linux/platform_device.h>
#include <linux/spinlock.h>
#include <linux/smp.h>
@@ -215,6 +216,42 @@ static const struct irq_domain_ops imsic_base_domain_ops = {
#endif
};

+#ifdef CONFIG_RISCV_IMSIC_PCI
+
+static void imsic_pci_mask_irq(struct irq_data *d)
+{
+ pci_msi_mask_irq(d);
+ irq_chip_mask_parent(d);
+}
+
+static void imsic_pci_unmask_irq(struct irq_data *d)
+{
+ pci_msi_unmask_irq(d);
+ irq_chip_unmask_parent(d);
+}
+
+static struct irq_chip imsic_pci_irq_chip = {
+ .name = "IMSIC-PCI",
+ .irq_mask = imsic_pci_mask_irq,
+ .irq_unmask = imsic_pci_unmask_irq,
+#ifdef CONFIG_SMP
+ .irq_set_affinity = imsic_irq_set_affinity,
+#endif
+ .irq_eoi = irq_chip_eoi_parent,
+};
+
+static struct msi_domain_ops imsic_pci_domain_ops = {
+};
+
+static struct msi_domain_info imsic_pci_domain_info = {
+ .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+ MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
+ .ops = &imsic_pci_domain_ops,
+ .chip = &imsic_pci_irq_chip,
+};
+
+#endif
+
static struct irq_chip imsic_plat_irq_chip = {
.name = "IMSIC-PLAT",
#ifdef CONFIG_SMP
@@ -243,6 +280,18 @@ static int imsic_irq_domains_init(struct fwnode_handle *fwnode)
}
irq_domain_update_bus_token(imsic->base_domain, DOMAIN_BUS_NEXUS);

+#ifdef CONFIG_RISCV_IMSIC_PCI
+ /* Create PCI MSI domain */
+ imsic->pci_domain = pci_msi_create_irq_domain(fwnode,
+ &imsic_pci_domain_info,
+ imsic->base_domain);
+ if (!imsic->pci_domain) {
+ pr_err("%pfwP: failed to create IMSIC PCI domain\n", fwnode);
+ irq_domain_remove(imsic->base_domain);
+ return -ENOMEM;
+ }
+#endif
+
/* Create Platform MSI domain */
imsic->plat_domain = platform_msi_create_irq_domain(fwnode,
&imsic_plat_domain_info,
@@ -250,6 +299,8 @@ static int imsic_irq_domains_init(struct fwnode_handle *fwnode)
if (!imsic->plat_domain) {
pr_err("%pfwP: failed to create IMSIC platform domain\n",
fwnode);
+ if (imsic->pci_domain)
+ irq_domain_remove(imsic->pci_domain);
irq_domain_remove(imsic->base_domain);
return -ENOMEM;
}
diff --git a/drivers/irqchip/irq-riscv-imsic-state.h b/drivers/irqchip/irq-riscv-imsic-state.h
index 82911b8b08b4..8d209e77432e 100644
--- a/drivers/irqchip/irq-riscv-imsic-state.h
+++ b/drivers/irqchip/irq-riscv-imsic-state.h
@@ -67,6 +67,7 @@ struct imsic_priv {

/* IRQ domains (created by platform driver) */
struct irq_domain *base_domain;
+ struct irq_domain *pci_domain;
struct irq_domain *plat_domain;
};

--
2.34.1

2023-10-23 17:32:01

by Anup Patel

[permalink] [raw]
Subject: [PATCH v11 08/14] irqchip/riscv-imsic: Add support for platform MSI irqdomain

The Linux platform MSI support requires a platform MSI irqdomain, so
let us add a platform irqchip driver for the RISC-V IMSIC which provides
a base IRQ domain and a platform MSI domain. This driver assumes that
the IMSIC state has already been initialized by the IMSIC early driver.

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Makefile | 2 +-
drivers/irqchip/irq-riscv-imsic-platform.c | 309 +++++++++++++++++++++
2 files changed, 310 insertions(+), 1 deletion(-)
create mode 100644 drivers/irqchip/irq-riscv-imsic-platform.c

diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index d714724387ce..abca445a3229 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -95,7 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
-obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o
+obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o irq-riscv-imsic-platform.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c
new file mode 100644
index 000000000000..23d286cb017e
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic-platform.c
@@ -0,0 +1,309 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-imsic: " fmt
+#include <linux/bitmap.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/platform_device.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+
+#include "irq-riscv-imsic-state.h"
+
+static int imsic_cpu_page_phys(unsigned int cpu,
+ unsigned int guest_index,
+ phys_addr_t *out_msi_pa)
+{
+ struct imsic_global_config *global;
+ struct imsic_local_config *local;
+
+ global = &imsic->global;
+ local = per_cpu_ptr(global->local, cpu);
+
+ if (BIT(global->guest_index_bits) <= guest_index)
+ return -EINVAL;
+
+ if (out_msi_pa)
+ *out_msi_pa = local->msi_pa +
+ (guest_index * IMSIC_MMIO_PAGE_SZ);
+
+ return 0;
+}
+
+static void imsic_irq_mask(struct irq_data *d)
+{
+ imsic_vector_mask(irq_data_get_irq_chip_data(d));
+}
+
+static void imsic_irq_unmask(struct irq_data *d)
+{
+ imsic_vector_unmask(irq_data_get_irq_chip_data(d));
+}
+
+static void imsic_irq_compose_vector_msg(struct imsic_vector *vec,
+ struct msi_msg *msg)
+{
+ phys_addr_t msi_addr;
+ int err;
+
+ if (WARN_ON(vec == NULL))
+ return;
+
+ err = imsic_cpu_page_phys(vec->cpu, 0, &msi_addr);
+ if (WARN_ON(err))
+ return;
+
+ msg->address_hi = upper_32_bits(msi_addr);
+ msg->address_lo = lower_32_bits(msi_addr);
+ msg->data = IMSIC_VECTOR_BASE_LOCAL_ID(vec);
+}
+
+static void imsic_irq_compose_msg(struct irq_data *d, struct msi_msg *msg)
+{
+ imsic_irq_compose_vector_msg(irq_data_get_irq_chip_data(d), msg);
+}
+
+#ifdef CONFIG_SMP
+static void imsic_msi_update_msg(struct irq_data *d, struct imsic_vector *vec)
+{
+ struct msi_msg msg[2] = { [1] = { }, };
+
+ imsic_irq_compose_vector_msg(vec, msg);
+ irq_data_get_irq_chip(d)->irq_write_msi_msg(d, msg);
+}
+
+static int imsic_irq_set_affinity(struct irq_data *d,
+ const struct cpumask *mask_val,
+ bool force)
+{
+ struct imsic_vector *old_vec, *new_vec;
+ struct irq_data *pd = d->parent_data;
+ unsigned int i, virq, hwirq;
+
+ old_vec = irq_data_get_irq_chip_data(pd);
+ if (WARN_ON(old_vec == NULL))
+ return -ENOENT;
+
+ /* Find-out base virq, hwirq and order of the old vector */
+ hwirq = IMSIC_VECTOR_BASE_HWIRQ(old_vec);
+ virq = pd->irq - (old_vec->hwirq - hwirq);
+
+ /* Ensure old vector points to the first entry */
+ if (old_vec->hwirq != hwirq) {
+ pd = irq_domain_get_irq_data(imsic->base_domain, virq);
+ old_vec = irq_data_get_irq_chip_data(pd);
+ }
+
+ /* Get a new vector on the desired set of CPUs */
+ new_vec = imsic_vector_alloc(hwirq, mask_val, old_vec->order);
+ if (!new_vec)
+ return -ENOSPC;
+
+ /* If old vector belongs to the desired CPU then do nothing */
+ if (old_vec->cpu == new_vec->cpu) {
+ imsic_vector_free(new_vec);
+ return IRQ_SET_MASK_OK_DONE;
+ }
+
+ /* Point device to the new vector */
+ imsic_msi_update_msg(d, new_vec);
+
+ /* Update irq descriptors */
+ for (i = 0; i < BIT(old_vec->order); i++) {
+ pd = irq_domain_get_irq_data(imsic->base_domain, virq + i);
+
+ /* Save the new vector entry in irq descriptor*/
+ pd->chip_data = new_vec + i;
+
+ /* Update effective affinity of parent irq data */
+ irq_data_update_effective_affinity(pd,
+ cpumask_of(new_vec->cpu));
+ }
+
+ /* Move state of the old vector to the new vector */
+ imsic_vector_move(old_vec, new_vec);
+
+ return IRQ_SET_MASK_OK_DONE;
+}
+#endif
+
+static struct irq_chip imsic_irq_base_chip = {
+ .name = "IMSIC-BASE",
+ .irq_mask = imsic_irq_mask,
+ .irq_unmask = imsic_irq_unmask,
+ .irq_compose_msi_msg = imsic_irq_compose_msg,
+ .flags = IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int imsic_irq_domain_alloc(struct irq_domain *domain,
+ unsigned int virq, unsigned int nr_irqs,
+ void *args)
+{
+ struct imsic_vector *vec;
+ int i, hwirq;
+
+ hwirq = imsic_hwirqs_alloc(get_count_order(nr_irqs));
+ if (hwirq < 0)
+ return hwirq;
+
+ vec = imsic_vector_alloc(hwirq, cpu_online_mask,
+ get_count_order(nr_irqs));
+ if (!vec) {
+ imsic_hwirqs_free(hwirq, get_count_order(nr_irqs));
+ return -ENOSPC;
+ }
+
+ for (i = 0; i < nr_irqs; i++) {
+ irq_domain_set_info(domain, virq + i, hwirq + i,
+ &imsic_irq_base_chip, vec + i,
+ handle_simple_irq, NULL, NULL);
+ irq_set_noprobe(virq + i);
+ irq_set_affinity(virq + i, cpu_online_mask);
+ /*
+ * IMSIC does not implement irq_disable() so Linux interrupt
+ * subsystem will take a lazy approach for disabling an IMSIC
+ * interrupt. This means IMSIC interrupts are left unmasked
+ * upon system suspend and interrupts are not processed
+ * immediately upon system wake up. To tackle this, we disable
+ * the lazy approach for all IMSIC interrupts.
+ */
+ irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
+ }
+
+ return 0;
+}
+
+static void imsic_irq_domain_free(struct irq_domain *domain,
+ unsigned int virq,
+ unsigned int nr_irqs)
+{
+ struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+
+ imsic_vector_free(irq_data_get_irq_chip_data(d));
+ imsic_hwirqs_free(d->hwirq, get_count_order(nr_irqs));
+ irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+static void imsic_irq_debug_show(struct seq_file *m, struct irq_domain *d,
+ struct irq_data *irqd, int ind)
+{
+ if (!irqd) {
+ imsic_vector_debug_show_summary(m, ind);
+ return;
+ }
+
+ imsic_vector_debug_show(m, irq_data_get_irq_chip_data(irqd), ind);
+}
+#endif
+
+static const struct irq_domain_ops imsic_base_domain_ops = {
+ .alloc = imsic_irq_domain_alloc,
+ .free = imsic_irq_domain_free,
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+ .debug_show = imsic_irq_debug_show,
+#endif
+};
+
+static struct irq_chip imsic_plat_irq_chip = {
+ .name = "IMSIC-PLAT",
+#ifdef CONFIG_SMP
+ .irq_set_affinity = imsic_irq_set_affinity,
+#endif
+};
+
+static struct msi_domain_ops imsic_plat_domain_ops = {
+};
+
+static struct msi_domain_info imsic_plat_domain_info = {
+ .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
+ .ops = &imsic_plat_domain_ops,
+ .chip = &imsic_plat_irq_chip,
+};
+
+static int imsic_irq_domains_init(struct fwnode_handle *fwnode)
+{
+ /* Create Base IRQ domain */
+ imsic->base_domain = irq_domain_create_tree(fwnode,
+ &imsic_base_domain_ops, imsic);
+ if (!imsic->base_domain) {
+ pr_err("%pfwP: failed to create IMSIC base domain\n",
+ fwnode);
+ return -ENOMEM;
+ }
+ irq_domain_update_bus_token(imsic->base_domain, DOMAIN_BUS_NEXUS);
+
+ /* Create Platform MSI domain */
+ imsic->plat_domain = platform_msi_create_irq_domain(fwnode,
+ &imsic_plat_domain_info,
+ imsic->base_domain);
+ if (!imsic->plat_domain) {
+ pr_err("%pfwP: failed to create IMSIC platform domain\n",
+ fwnode);
+ irq_domain_remove(imsic->base_domain);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static int imsic_platform_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct imsic_global_config *global;
+ int rc;
+
+ if (!imsic) {
+ dev_err(dev, "early driver not probed\n");
+ return -ENODEV;
+ }
+
+ if (imsic->base_domain) {
+ dev_err(dev, "irq domain already created\n");
+ return -ENODEV;
+ }
+
+ global = &imsic->global;
+
+ /* Initialize IRQ and MSI domains */
+ rc = imsic_irq_domains_init(dev->fwnode);
+ if (rc) {
+ dev_err(dev, "failed to initialize IRQ and MSI domains\n");
+ return rc;
+ }
+
+ dev_info(dev, " hart-index-bits: %d, guest-index-bits: %d\n",
+ global->hart_index_bits, global->guest_index_bits);
+ dev_info(dev, " group-index-bits: %d, group-index-shift: %d\n",
+ global->group_index_bits, global->group_index_shift);
+ dev_info(dev, " per-CPU IDs %d at base PPN %pa\n",
+ global->nr_ids, &global->base_addr);
+ dev_info(dev, " total %d interrupts available\n",
+ imsic->nr_hwirqs);
+
+ return 0;
+}
+
+static const struct of_device_id imsic_platform_match[] = {
+ { .compatible = "riscv,imsics" },
+ {}
+};
+
+static struct platform_driver imsic_platform_driver = {
+ .driver = {
+ .name = "riscv-imsic",
+ .of_match_table = imsic_platform_match,
+ },
+ .probe = imsic_platform_probe,
+};
+builtin_platform_driver(imsic_platform_driver);
--
2.34.1

2023-10-24 05:32:12

by Sunil V L

[permalink] [raw]
Subject: Re: [PATCH v11 12/14] irqchip/riscv-aplic: Add support for MSI-mode

Hi Anup,

On Mon, Oct 23, 2023 at 10:57:58PM +0530, Anup Patel wrote:
> The RISC-V advanced platform-level interrupt controller (APLIC) has
> two modes of operation: 1) Direct mode and 2) MSI mode.
> (For more details, refer https://github.com/riscv/riscv-aia)
>
> In APLIC MSI-mode, wired interrupts are forwarded as message signaled
> interrupts (MSIs) to CPUs via IMSIC.
>
> We extend the existing APLIC irqchip driver to support MSI-mode for
> RISC-V platforms having both wired interrupts and MSIs.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
[...]
> +int aplic_msi_setup(struct device *dev, void __iomem *regs)
> +{
> + const struct imsic_global_config *imsic_global;
> + struct irq_domain *irqdomain;
> + struct aplic_priv *priv;
> + struct aplic_msicfg *mc;
> + phys_addr_t pa;
> + int rc;
> +
> + priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> + if (!priv)
> + return -ENOMEM;
> +
> + rc = aplic_setup_priv(priv, dev, regs);
> + if (!priv) {
This should check rc instead of priv.

> + dev_err(dev, "failed to create APLIC context\n");
> + return rc;
> + }
> + mc = &priv->msicfg;
> +
> + /*
> + * The APLIC outgoing MSI config registers assume target MSI
> + * controller to be RISC-V AIA IMSIC controller.
> + */
> + imsic_global = imsic_get_global_config();
> + if (!imsic_global) {
> + dev_err(dev, "IMSIC global config not found\n");
> + return -ENODEV;
For all error return paths, priv should be freed.
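
For reference, a minimal sketch of the first correction, checking rc
rather than priv (since priv comes from devm_kzalloc(), it is released
automatically on probe failure):

	rc = aplic_setup_priv(priv, dev, regs);
	if (rc) {
		dev_err(dev, "failed to create APLIC context\n");
		return rc;
	}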

Thanks,
Sunil

2023-10-24 09:25:56

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

On Mon, Oct 23, 2023 at 10:57:53PM +0530, Anup Patel wrote:

> +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
> +void imsic_vector_debug_show(struct seq_file *m,
> + struct imsic_vector *vec, int ind)
> +{
> + unsigned int mcpu = 0, mlocal_id = 0;
> + struct imsic_local_priv *lpriv;
> + bool move_in_progress = false;
> + struct imsic_vector *mvec;
> + bool is_enabled = false;
> + unsigned long flags;
> +
> + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> + return;
> +
> + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> + if (test_bit(vec->local_id, lpriv->ids_enabled_bitmap))
> + is_enabled = true;
> + mvec = lpriv->ids_move[vec->local_id];
> + if (mvec) {
> + move_in_progress = true;
> + mcpu = mvec->cpu;
> + mlocal_id = mvec->local_id;
> + }
> + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> +
> + seq_printf(m, "%*starget_cpu : %5u\n", ind, "", vec->cpu);
> + seq_printf(m, "%*starget_local_id : %5u\n", ind, "", vec->local_id);
> + seq_printf(m, "%*sis_reserved : %5u\n", ind, "",
> + (vec->local_id <= IMSIC_IPI_ID) ? 1 : 0);

> + seq_printf(m, "%*sis_enabled : %5u\n", ind, "",
> + (move_in_progress) ? 1 : 0);

gcc & clang report:
drivers/irqchip/irq-riscv-imsic-state.c:288:14: warning: variable 'is_enabled' set but not used [-Wunused-but-set-variable]

This looks to be a copy-pasta issue, and the move_in_progress here
should be is_enabled?

> + seq_printf(m, "%*sis_move_pending : %5u\n", ind, "",
> + (move_in_progress) ? 1 : 0);
> + if (move_in_progress) {
> + seq_printf(m, "%*smove_cpu : %5u\n", ind, "", mcpu);
> + seq_printf(m, "%*smove_local_id : %5u\n", ind, "", mlocal_id);
> + }
> +}



2023-10-24 11:55:46

by Björn Töpel

[permalink] [raw]
Subject: Re: [PATCH v11 01/14] RISC-V: Don't fail in riscv_of_parent_hartid() for disabled HARTs

Anup Patel <[email protected]> writes:

> The riscv_of_processor_hartid() used by riscv_of_parent_hartid() fails
> for HARTs disabled in the DT. This results in the following warning
> thrown by the RISC-V INTC driver for the E-core on SiFive boards:
>
> [ 0.000000] riscv-intc: unable to find hart id for /cpus/cpu@0/interrupt-controller
>
> The riscv_of_parent_hartid() is only expected to read the hartid from
> the DT so we should directly call of_get_cpu_hwid() instead of calling
> riscv_of_processor_hartid().
>
> Fixes: ad635e723e17 ("riscv: cpu: Add 64bit hartid support on RV64")

Patches 1 and 3: These fixes are standalone, and don't have to be part
of this series.

Wouldn't it be better to pull these out of this long-running series, and
try to get the fixes in ASAP?


Björn

2023-10-24 12:07:48

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v11 01/14] RISC-V: Don't fail in riscv_of_parent_hartid() for disabled HARTs

On Tue, Oct 24, 2023 at 5:25 PM Björn Töpel <[email protected]> wrote:
>
> Anup Patel <[email protected]> writes:
>
> > The riscv_of_processor_hartid() used by riscv_of_parent_hartid() fails
> > for HARTs disabled in the DT. This results in the following warning
> > thrown by the RISC-V INTC driver for the E-core on SiFive boards:
> >
> > [ 0.000000] riscv-intc: unable to find hart id for /cpus/cpu@0/interrupt-controller
> >
> > The riscv_of_parent_hartid() is only expected to read the hartid from
> > the DT so we should directly call of_get_cpu_hwid() instead of calling
> > riscv_of_processor_hartid().
> >
> > Fixes: ad635e723e17 ("riscv: cpu: Add 64bit hartid support on RV64")
>
> Patches 1 and 3: These fixes are standalone, and don't have to be part
> of this series.
>
> Wouldn't it be better to pull these out of this long-running series, and
> try to get the fixes in ASAP?

Yes, that is correct. In fact, PATCH2 can also be taken for Linux-6.7.

I suggest PATCH1 to PATCH3 (3 patches) be taken for Linux-6.7
since the merge window is pretty close.

Regards,
Anup

2023-10-24 12:09:13

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

On Tue, Oct 24, 2023 at 2:55 PM Conor Dooley <[email protected]> wrote:
>
> On Mon, Oct 23, 2023 at 10:57:53PM +0530, Anup Patel wrote:
>
> > +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
> > +void imsic_vector_debug_show(struct seq_file *m,
> > + struct imsic_vector *vec, int ind)
> > +{
> > + unsigned int mcpu = 0, mlocal_id = 0;
> > + struct imsic_local_priv *lpriv;
> > + bool move_in_progress = false;
> > + struct imsic_vector *mvec;
> > + bool is_enabled = false;
> > + unsigned long flags;
> > +
> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> > + return;
> > +
> > + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> > + if (test_bit(vec->local_id, lpriv->ids_enabled_bitmap))
> > + is_enabled = true;
> > + mvec = lpriv->ids_move[vec->local_id];
> > + if (mvec) {
> > + move_in_progress = true;
> > + mcpu = mvec->cpu;
> > + mlocal_id = mvec->local_id;
> > + }
> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> > +
> > + seq_printf(m, "%*starget_cpu : %5u\n", ind, "", vec->cpu);
> > + seq_printf(m, "%*starget_local_id : %5u\n", ind, "", vec->local_id);
> > + seq_printf(m, "%*sis_reserved : %5u\n", ind, "",
> > + (vec->local_id <= IMSIC_IPI_ID) ? 1 : 0);
>
> > + seq_printf(m, "%*sis_enabled : %5u\n", ind, "",
> > + (move_in_progress) ? 1 : 0);
>
> gcc & clang report:
> drivers/irqchip/irq-riscv-imsic-state.c:288:14: warning: variable 'is_enabled' set but not used [-Wunused-but-set-variable]
>
> This looks to be a copy-pasta issue, and the move_in_progress here
> should be is_enabled?

Thanks for catching. Strangely, I did not see this warning with
the toolchain which I use.

I will fix it in the next patch revision.
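
The fix is presumably the one-line change below, printing is_enabled
instead of move_in_progress (a sketch based on Conor's reading of the
warning):

	seq_printf(m, "%*sis_enabled       : %5u\n", ind, "",
		   (is_enabled) ? 1 : 0);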

Regards,
Anup

>
> > + seq_printf(m, "%*sis_move_pending : %5u\n", ind, "",
> > + (move_in_progress) ? 1 : 0);
> > + if (move_in_progress) {
> > + seq_printf(m, "%*smove_cpu : %5u\n", ind, "", mcpu);
> > + seq_printf(m, "%*smove_local_id : %5u\n", ind, "", mlocal_id);
> > + }
> > +}

2023-10-24 12:18:05

by Andrew Jones

[permalink] [raw]
Subject: Re: [PATCH v11 05/14] irqchip/riscv-intc: Add support for RISC-V AIA

On Mon, Oct 23, 2023 at 10:57:51PM +0530, Anup Patel wrote:
> The RISC-V advanced interrupt architecture (AIA) extends the per-HART
> local interrupts in following ways:
> 1. Minimum 64 local interrupts for both RV32 and RV64
> 2. Ability to process multiple pending local interrupts in same
> interrupt handler
> 3. Priority configuration for each local interrupts
> 4. Special CSRs to configure/access the per-HART MSI controller
>
> We add support for #1 and #2 described above in the RISC-V intc driver.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> drivers/irqchip/irq-riscv-intc.c | 34 ++++++++++++++++++++++++++------
> 1 file changed, 28 insertions(+), 6 deletions(-)
>

Reviewed-by: Andrew Jones <[email protected]>

2023-10-24 12:31:10

by Andrew Jones

[permalink] [raw]
Subject: Re: [PATCH v11 06/14] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller

On Mon, Oct 23, 2023 at 10:57:52PM +0530, Anup Patel wrote:
> We add DT bindings document for the RISC-V incoming MSI controller
> (IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
> specification.
>
> Signed-off-by: Anup Patel <[email protected]>
> Reviewed-by: Conor Dooley <[email protected]>
> Acked-by: Krzysztof Kozlowski <[email protected]>
> ---
> .../interrupt-controller/riscv,imsics.yaml | 172 ++++++++++++++++++
> 1 file changed, 172 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
>
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> new file mode 100644
> index 000000000000..84976f17a4a1
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
> @@ -0,0 +1,172 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: RISC-V Incoming MSI Controller (IMSIC)
> +
> +maintainers:
> + - Anup Patel <[email protected]>
> +
> +description: |
> + The RISC-V advanced interrupt architecture (AIA) defines a per-CPU incoming
> + MSI controller (IMSIC) for handling MSIs in a RISC-V platform. The RISC-V
> + AIA specification can be found at https://github.com/riscv/riscv-aia.
> +
> + The IMSIC is a per-CPU (or per-HART) device with separate interrupt file
> + for each privilege level (machine or supervisor). The configuration of
> + a IMSIC interrupt file is done using AIA CSRs and it also has a 4KB MMIO

an IMSIC

> + space to receive MSIs from devices. Each IMSIC interrupt file supports a
> + fixed number of interrupt identities (to distinguish MSIs from devices)
> + which is same for given privilege level across CPUs (or HARTs).

which is the same for a given

> +
> + The device tree of a RISC-V platform will have one IMSIC device tree node
> + for each privilege level (machine or supervisor) which collectively describe
> + IMSIC interrupt files at that privilege level across CPUs (or HARTs).

s/at that privilege level/for their respective privilege levels/

> +
> + The arrangement of IMSIC interrupt files in MMIO space of a RISC-V platform
> + follows a particular scheme defined by the RISC-V AIA specification. A IMSIC

An IMSIC

> + group is a set of IMSIC interrupt files co-located in MMIO space and we can
> + have multiple IMSIC groups (i.e. clusters, sockets, chiplets, etc) in a
> + RISC-V platform. The MSI target address of a IMSIC interrupt file at given

an IMSIC interrupt file at a given

> + privilege level (machine or supervisor) encodes group index, HART index,
> + and guest index (shown below).
> +
> + XLEN-1 > (HART Index MSB) 12 0
> + | | | |
> + -------------------------------------------------------------
> + |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index| 0 |
> + -------------------------------------------------------------
> +
> +allOf:
> + - $ref: /schemas/interrupt-controller.yaml#
> + - $ref: /schemas/interrupt-controller/msi-controller.yaml#
> +
> +properties:
> + compatible:
> + items:
> + - enum:
> + - qemu,imsics
> + - const: riscv,imsics
> +
> + reg:
> + minItems: 1
> + maxItems: 16384
> + description:
> + Base address of each IMSIC group.
> +
> + interrupt-controller: true
> +
> + "#interrupt-cells":
> + const: 0
> +
> + msi-controller: true
> +
> + "#msi-cells":
> + const: 0
> +
> + interrupts-extended:
> + minItems: 1
> + maxItems: 16384
> + description:
> + This property represents the set of CPUs (or HARTs) for which given

which a given

> + device tree node describes the IMSIC interrupt files. Each node pointed
> + to should be a riscv,cpu-intc node, which has a CPU node (i.e. RISC-V
> + HART) as parent.

as its parent

> +
> + riscv,num-ids:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + minimum: 63
> + maximum: 2047
> + description:
> + Number of interrupt identities supported by IMSIC interrupt file.

by an IMSIC interrupt file

> +
> + riscv,num-guest-ids:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + minimum: 63
> + maximum: 2047
> + description:
> + Number of interrupt identities are supported by IMSIC guest interrupt

which are supported by an IMSIC guest interrupt file

> + file. When not specified it is assumed to be same as specified by the

the same

> + riscv,num-ids property.
> +
> + riscv,guest-index-bits:
> + minimum: 0
> + maximum: 7
> + default: 0
> + description:
> + Number of guest index bits in the MSI target address.
> +
> + riscv,hart-index-bits:
> + minimum: 0
> + maximum: 15
> + description:
> + Number of HART index bits in the MSI target address. When not
> + specified it is calculated based on the interrupts-extended property.
> +
> + riscv,group-index-bits:
> + minimum: 0
> + maximum: 7
> + default: 0
> + description:
> + Number of group index bits in the MSI target address.
> +
> + riscv,group-index-shift:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + minimum: 0
> + maximum: 55
> + default: 24
> + description:
> + The least significant bit position of the group index bits in the
> + MSI target address.
> +
> +required:
> + - compatible
> + - reg
> + - interrupt-controller
> + - msi-controller
> + - "#msi-cells"
> + - interrupts-extended
> + - riscv,num-ids
> +
> +unevaluatedProperties: false
> +
> +examples:
> + - |
> + // Example 1 (Machine-level IMSIC files with just one group):
> +
> + interrupt-controller@24000000 {
> + compatible = "qemu,imsics", "riscv,imsics";
> + interrupts-extended = <&cpu1_intc 11>,
> + <&cpu2_intc 11>,
> + <&cpu3_intc 11>,
> + <&cpu4_intc 11>;
> + reg = <0x28000000 0x4000>;
> + interrupt-controller;
> + #interrupt-cells = <0>;
> + msi-controller;
> + #msi-cells = <0>;
> + riscv,num-ids = <127>;
> + };
> +
> + - |
> + // Example 2 (Supervisor-level IMSIC files with two groups):
> +
> + interrupt-controller@28000000 {
> + compatible = "qemu,imsics", "riscv,imsics";
> + interrupts-extended = <&cpu1_intc 9>,
> + <&cpu2_intc 9>,
> + <&cpu3_intc 9>,
> + <&cpu4_intc 9>;
> + reg = <0x28000000 0x2000>, /* Group0 IMSICs */
> + <0x29000000 0x2000>; /* Group1 IMSICs */
> + interrupt-controller;
> + #interrupt-cells = <0>;
> + msi-controller;
> + #msi-cells = <0>;
> + riscv,num-ids = <127>;
> + riscv,group-index-bits = <1>;
> + riscv,group-index-shift = <24>;
> + };
> +...
> --
> 2.34.1
>

Thanks,
drew

2023-10-24 13:10:00

by Björn Töpel

[permalink] [raw]
Subject: Re: [PATCH v11 09/14] irqchip/riscv-imsic: Add support for PCI MSI irqdomain

Anup Patel <[email protected]> writes:

> The Linux PCI framework requires it's own dedicated MSI irqdomain so
> let us create PCI MSI irqdomain as child of the IMSIC base irqdomain.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> drivers/irqchip/Kconfig | 7 +++
> drivers/irqchip/irq-riscv-imsic-platform.c | 51 ++++++++++++++++++++++
> drivers/irqchip/irq-riscv-imsic-state.h | 1 +
> 3 files changed, 59 insertions(+)
>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index bdd80716114d..c1d69b418dfb 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -552,6 +552,13 @@ config RISCV_IMSIC
> select IRQ_DOMAIN_HIERARCHY
> select GENERIC_MSI_IRQ
>
> +config RISCV_IMSIC_PCI
> + bool
> + depends on RISCV_IMSIC
> + depends on PCI
> + depends on PCI_MSI
> + default RISCV_IMSIC
> +
> config EXYNOS_IRQ_COMBINER
> bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c
> index 23d286cb017e..cdb659401199 100644
> --- a/drivers/irqchip/irq-riscv-imsic-platform.c
> +++ b/drivers/irqchip/irq-riscv-imsic-platform.c
> @@ -13,6 +13,7 @@
> #include <linux/irqdomain.h>
> #include <linux/module.h>
> #include <linux/msi.h>
> +#include <linux/pci.h>
> #include <linux/platform_device.h>
> #include <linux/spinlock.h>
> #include <linux/smp.h>
> @@ -215,6 +216,42 @@ static const struct irq_domain_ops imsic_base_domain_ops = {
> #endif
> };
>
> +#ifdef CONFIG_RISCV_IMSIC_PCI
> +
> +static void imsic_pci_mask_irq(struct irq_data *d)
> +{
> + pci_msi_mask_irq(d);
> + irq_chip_mask_parent(d);

I've asked this before, but I still don't get why you need to propagate
to the parent. Why isn't masking at the PCI level enough?


Björn

2023-10-24 13:10:27

by Björn Töpel

[permalink] [raw]
Subject: Re: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

Hi Anup!

Wow, I'm really happy to see that you're moving towards the 1-1 model!

Anup Patel <[email protected]> writes:

> The RISC-V advanced interrupt architecture (AIA) specification
> defines a new MSI controller called incoming message signalled
> interrupt controller (IMSIC) which manages MSI on per-HART (or
> per-CPU) basis. It also supports IPIs as software injected MSIs.
> (For more details refer https://github.com/riscv/riscv-aia)
>
> Let us add an early irqchip driver for RISC-V IMSIC which sets
> up the IMSIC state and provide IPIs.

It would help (reviewers, and future bugfixers) if you add (here or in
the cover) what design decisions you've taken instead of just saying
that you're now supporting IMSIC.

> Signed-off-by: Anup Patel <[email protected]>
> ---
> drivers/irqchip/Kconfig | 6 +
> drivers/irqchip/Makefile | 1 +
> drivers/irqchip/irq-riscv-imsic-early.c | 235 ++++++
> drivers/irqchip/irq-riscv-imsic-state.c | 962 ++++++++++++++++++++++++
> drivers/irqchip/irq-riscv-imsic-state.h | 109 +++
> include/linux/irqchip/riscv-imsic.h | 87 +++
> 6 files changed, 1400 insertions(+)
> create mode 100644 drivers/irqchip/irq-riscv-imsic-early.c
> create mode 100644 drivers/irqchip/irq-riscv-imsic-state.c
> create mode 100644 drivers/irqchip/irq-riscv-imsic-state.h
> create mode 100644 include/linux/irqchip/riscv-imsic.h
>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index f7149d0f3d45..bdd80716114d 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -546,6 +546,12 @@ config SIFIVE_PLIC
> select IRQ_DOMAIN_HIERARCHY
> select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>
> +config RISCV_IMSIC
> + bool
> + depends on RISCV
> + select IRQ_DOMAIN_HIERARCHY
> + select GENERIC_MSI_IRQ
> +
> config EXYNOS_IRQ_COMBINER
> bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index ffd945fe71aa..d714724387ce 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o
> obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
> diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
> new file mode 100644
> index 000000000000..23f689ff5807
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-imsic-early.c
> @@ -0,0 +1,235 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-imsic: " fmt
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/module.h>
> +#include <linux/spinlock.h>
> +#include <linux/smp.h>
> +
> +#include "irq-riscv-imsic-state.h"
> +
> +static int imsic_parent_irq;
> +
> +#ifdef CONFIG_SMP
> +static irqreturn_t imsic_local_sync_handler(int irq, void *data)
> +{
> + imsic_local_sync();
> + return IRQ_HANDLED;
> +}
> +
> +static void imsic_ipi_send(unsigned int cpu)
> +{
> + struct imsic_local_config *local =
> + per_cpu_ptr(imsic->global.local, cpu);

General nit for the series: there are a lot of line breaks. Try
to use the full 100 chars for a line.

> +
> + writel(IMSIC_IPI_ID, local->msi_va);

Do you need the barriers here? If so, please document. If not, use the
_relaxed() version.

> +}
> +
> +static void imsic_ipi_starting_cpu(void)
> +{
> + /* Enable IPIs for current CPU. */
> + __imsic_id_set_enable(IMSIC_IPI_ID);
> +
> + /* Enable virtual IPI used for IMSIC ID synchronization */
> + enable_percpu_irq(imsic->ipi_virq, 0);

Maybe pass IRQ_TYPE_NONE instead of 0, so it's clearer?

> +}
> +
> +static void imsic_ipi_dying_cpu(void)
> +{
> + /*
> + * Disable virtual IPI used for IMSIC ID synchronization so
> + * that we don't receive ID synchronization requests.
> + */
> + disable_percpu_irq(imsic->ipi_virq);
> +}
> +
> +static int __init imsic_ipi_domain_init(void)
> +{
> + int virq;
> +
> + /* Create IMSIC IPI multiplexing */
> + virq = ipi_mux_create(IMSIC_NR_IPI, imsic_ipi_send);
> + if (virq <= 0)
> + return (virq < 0) ? virq : -ENOMEM;
> + imsic->ipi_virq = virq;
> +
> + /* First vIRQ is used for IMSIC ID synchronization */
> + virq = request_percpu_irq(imsic->ipi_virq, imsic_local_sync_handler,
> + "riscv-imsic-lsync", imsic->global.local);
> + if (virq)
> + return virq;
> + irq_set_status_flags(imsic->ipi_virq, IRQ_HIDDEN);
> + imsic->ipi_lsync_desc = irq_to_desc(imsic->ipi_virq);
> +
> + /* Set vIRQ range */
> + riscv_ipi_set_virq_range(imsic->ipi_virq + 1, IMSIC_NR_IPI - 1, true);
> +
> + /* Announce that IMSIC is providing IPIs */
> + pr_info("%pfwP: providing IPIs using interrupt %d\n",
> + imsic->fwnode, IMSIC_IPI_ID);
> +
> + return 0;
> +}
> +#else
> +static void imsic_ipi_starting_cpu(void)
> +{
> +}
> +
> +static void imsic_ipi_dying_cpu(void)
> +{
> +}
> +
> +static int __init imsic_ipi_domain_init(void)
> +{
> + return 0;
> +}
> +#endif
> +
> +/*
> + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> + * Linux interrupt number and let Linux IRQ subsystem handle it.
> + */
> +static void imsic_handle_irq(struct irq_desc *desc)
> +{
> + struct irq_chip *chip = irq_desc_get_chip(desc);
> + int err, cpu = smp_processor_id();
> + struct imsic_vector *vec;
> + unsigned long local_id;
> +
> + chained_irq_enter(chip, desc);
> +
> + while ((local_id = csr_swap(CSR_TOPEI, 0))) {
> + local_id = local_id >> TOPEI_ID_SHIFT;
> +
> + if (local_id == IMSIC_IPI_ID) {
> +#ifdef CONFIG_SMP
> + ipi_mux_process();
> +#endif
> + continue;
> + }
> +
> + if (unlikely(!imsic->base_domain))
> + continue;
> +
> + vec = imsic_vector_from_local_id(cpu, local_id);
> + if (!vec) {
> + pr_warn_ratelimited(
> + "vector not found for local ID 0x%lx\n",
> + local_id);
> + continue;
> + }
> +
> + err = generic_handle_domain_irq(imsic->base_domain,
> + vec->hwirq);
> + if (unlikely(err))
> + pr_warn_ratelimited(
> + "hwirq 0x%x mapping not found\n",
> + vec->hwirq);
> + }
> +
> + chained_irq_exit(chip, desc);
> +}
> +
> +static int imsic_starting_cpu(unsigned int cpu)
> +{
> + /* Enable per-CPU parent interrupt */
> + enable_percpu_irq(imsic_parent_irq,
> + irq_get_trigger_type(imsic_parent_irq));
> +
> + /* Setup IPIs */
> + imsic_ipi_starting_cpu();
> +
> + /*
> + * Interrupts identities might have been enabled/disabled while
> + * this CPU was not running so sync-up local enable/disable state.
> + */
> + imsic_local_sync();
> +
> + /* Enable local interrupt delivery */
> + imsic_local_delivery(true);
> +
> + return 0;
> +}
> +
> +static int imsic_dying_cpu(unsigned int cpu)
> +{
> + /* Cleanup IPIs */
> + imsic_ipi_dying_cpu();
> +
> + return 0;
> +}
> +
> +static int __init imsic_early_probe(struct fwnode_handle *fwnode)
> +{
> + int rc;
> + struct irq_domain *domain;
> +
> + /* Find parent domain and register chained handler */
> + domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> + DOMAIN_BUS_ANY);
> + if (!domain) {
> + pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
> + return -ENOENT;
> + }
> + imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> + if (!imsic_parent_irq) {
> + pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
> + return -ENOENT;
> + }
> + irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
> +
> + /* Initialize IPI domain */
> + rc = imsic_ipi_domain_init();
> + if (rc) {
> + pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
> + return rc;
> + }
> +
> + /*
> + * Setup cpuhp state (must be done after setting imsic_parent_irq)
> + *
> + * Don't disable per-CPU IMSIC file when CPU goes offline
> + * because this affects IPI and the masking/unmasking of
> + * virtual IPIs is done via generic IPI-Mux
> + */
> + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> + "irqchip/riscv/imsic:starting",
> + imsic_starting_cpu, imsic_dying_cpu);
> +
> + return 0;
> +}
> +
> +static int __init imsic_early_dt_init(struct device_node *node,
> + struct device_node *parent)
> +{
> + int rc;
> + struct fwnode_handle *fwnode = &node->fwnode;
> +
> + /* Setup IMSIC state */
> + rc = imsic_setup_state(fwnode);
> + if (rc) {
> + pr_err("%pfwP: failed to setup state (error %d)\n",
> + fwnode, rc);
> + return rc;
> + }
> +
> + /* Do early setup of IPIs */
> + rc = imsic_early_probe(fwnode);
> + if (rc)
> + return rc;
> +
> + /* Ensure that OF platform device gets probed */
> + of_node_clear_flag(node, OF_POPULATED);
> + return 0;
> +}
> +IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_early_dt_init);
> diff --git a/drivers/irqchip/irq-riscv-imsic-state.c b/drivers/irqchip/irq-riscv-imsic-state.c
> new file mode 100644
> index 000000000000..54465e47851c
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-imsic-state.c
> @@ -0,0 +1,962 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-imsic: " fmt
> +#include <linux/cpu.h>
> +#include <linux/bitmap.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_irq.h>
> +#include <linux/seq_file.h>
> +#include <linux/spinlock.h>
> +#include <linux/smp.h>
> +#include <asm/hwcap.h>
> +
> +#include "irq-riscv-imsic-state.h"
> +
> +#define IMSIC_DISABLE_EIDELIVERY 0
> +#define IMSIC_ENABLE_EIDELIVERY 1
> +#define IMSIC_DISABLE_EITHRESHOLD 1
> +#define IMSIC_ENABLE_EITHRESHOLD 0
> +
> +#define imsic_csr_write(__c, __v) \
> +do { \
> + csr_write(CSR_ISELECT, __c); \
> + csr_write(CSR_IREG, __v); \
> +} while (0)
> +
> +#define imsic_csr_read(__c) \
> +({ \
> + unsigned long __v; \
> + csr_write(CSR_ISELECT, __c); \
> + __v = csr_read(CSR_IREG); \
> + __v; \
> +})
> +
> +#define imsic_csr_read_clear(__c, __v) \
> +({ \
> + unsigned long __r; \
> + csr_write(CSR_ISELECT, __c); \
> + __r = csr_read_clear(CSR_IREG, __v); \
> + __r; \
> +})
> +
> +#define imsic_csr_set(__c, __v) \
> +do { \
> + csr_write(CSR_ISELECT, __c); \
> + csr_set(CSR_IREG, __v); \
> +} while (0)
> +
> +#define imsic_csr_clear(__c, __v) \
> +do { \
> + csr_write(CSR_ISELECT, __c); \
> + csr_clear(CSR_IREG, __v); \
> +} while (0)
> +
> +struct imsic_priv *imsic;
> +
> +const struct imsic_global_config *imsic_get_global_config(void)
> +{
> + return (imsic) ? &imsic->global : NULL;

Nit: No need for the parentheses.

> +}
> +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> +
> +static bool __imsic_eix_read_clear(unsigned long id, bool pend)
> +{
> + unsigned long isel, imask;
> +
> + isel = id / BITS_PER_LONG;
> + isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> + isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> + imask = BIT(id & (__riscv_xlen - 1));
> +
> + return (imsic_csr_read_clear(isel, imask) & imask) ? true : false;
> +}
> +
> +#define __imsic_id_read_clear_enabled(__id) \
> + __imsic_eix_read_clear((__id), false)
> +#define __imsic_id_read_clear_pending(__id) \
> + __imsic_eix_read_clear((__id), true)
> +
> +void __imsic_eix_update(unsigned long base_id,
> + unsigned long num_id, bool pend, bool val)
> +{
> + unsigned long i, isel, ireg;
> + unsigned long id = base_id, last_id = base_id + num_id;
> +
> + while (id < last_id) {
> + isel = id / BITS_PER_LONG;
> + isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> + isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;

Parenthesis nit.

> +
> + ireg = 0;
> + for (i = id & (__riscv_xlen - 1);
> + (id < last_id) && (i < __riscv_xlen); i++) {
> + ireg |= BIT(i);
> + id++;
> + }
> +
> + /*
> + * The IMSIC EIEx and EIPx registers are indirectly
> + * accessed via using ISELECT and IREG CSRs so we
> + * need to access these CSRs without getting preempted.
> + *
> + * All existing users of this function call this
> + * function with local IRQs disabled so we don't
> + * need to do anything special here.
> + */
> + if (val)
> + imsic_csr_set(isel, ireg);
> + else
> + imsic_csr_clear(isel, ireg);
> + }
> +}
> +
> +void imsic_local_sync(void)
> +{
> + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> + struct imsic_local_config *mlocal;
> + struct imsic_vector *mvec;
> + unsigned long flags;
> + int i;
> +
> + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> + for (i = 1; i <= imsic->global.nr_ids; i++) {
> + if (i == IMSIC_IPI_ID)
> + continue;
> +
> + if (test_bit(i, lpriv->ids_enabled_bitmap))
> + __imsic_id_set_enable(i);
> + else
> + __imsic_id_clear_enable(i);
> +
> + mvec = lpriv->ids_move[i];
> + lpriv->ids_move[i] = NULL;
> + if (mvec) {
> + if (__imsic_id_read_clear_pending(i)) {
> + mlocal = per_cpu_ptr(imsic->global.local,
> + mvec->cpu);
> + writel(mvec->local_id, mlocal->msi_va);

Again, do you need all the barriers? If yes, document. If not, relax
the call.

> + }
> +
> + lpriv->vectors[i].hwirq = UINT_MAX;
> + lpriv->vectors[i].order = UINT_MAX;
> + clear_bit(i, lpriv->ids_used_bitmap);
> + }
> +
> + }
> + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> +}
> +
> +void imsic_local_delivery(bool enable)
> +{
> + if (enable) {
> + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> + } else {
> + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> + }

My regular "early exit" nit. I guess I really dislike indentation. ;-)

> +}
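
For what it's worth, the early-exit form being hinted at would
presumably read (a sketch, not part of the patch, with the same write
ordering in both cases):

	void imsic_local_delivery(bool enable)
	{
		if (enable) {
			imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
			imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
			return;
		}

		imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
		imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
	}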
> +
> +#ifdef CONFIG_SMP
> +static void imsic_remote_sync(unsigned int cpu)
> +{
> + /*
> + * We simply inject ID synchronization IPI to a target CPU
> + * if it is not same as the current CPU. The ipi_send_mask()
> + * implementation of IPI mux will inject ID synchronization
> + * IPI only for CPUs that have enabled it so offline CPUs
> + * won't receive IPI. An offline CPU will unconditionally
> + * synchronize IDs through imsic_starting_cpu() when the
> + * CPU is brought up.
> + */
> + if (cpu_online(cpu)) {
> + if (cpu != smp_processor_id())
> + __ipi_send_mask(imsic->ipi_lsync_desc, cpumask_of(cpu));
> + else
> + imsic_local_sync();
> + }
> +}
> +#else
> +static inline void imsic_remote_sync(unsigned int cpu)

Remove inline.

> +{
> + imsic_local_sync();
> +}
> +#endif
> +
> +void imsic_vector_mask(struct imsic_vector *vec)
> +{
> + struct imsic_local_priv *lpriv;
> + unsigned long flags;
> +
> + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> + return;
> +
> + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> + bitmap_clear(lpriv->ids_enabled_bitmap, vec->local_id, 1);
> + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> +
> + imsic_remote_sync(vec->cpu);

x86 seems to set a timer instead for the remote cpu cleanup, which can
be much cheaper and less intrusive. Is that applicable here?

> +}
> +
> +void imsic_vector_unmask(struct imsic_vector *vec)
> +{
> + struct imsic_local_priv *lpriv;
> + unsigned long flags;
> +
> + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> + return;
> +
> + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> + bitmap_set(lpriv->ids_enabled_bitmap, vec->local_id, 1);
> + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> +
> + imsic_remote_sync(vec->cpu);
> +}
> +
> +void imsic_vector_move(struct imsic_vector *old_vec,
> + struct imsic_vector *new_vec)
> +{
> + struct imsic_local_priv *old_lpriv, *new_lpriv;
> + struct imsic_vector *ovec, *nvec;
> + unsigned long flags, flags1;
> + unsigned int i;
> +
> + if (WARN_ON(old_vec->cpu == new_vec->cpu ||
> + old_vec->order != new_vec->order ||
> + (old_vec->local_id & IMSIC_VECTOR_MASK(old_vec)) ||
> + (new_vec->local_id & IMSIC_VECTOR_MASK(new_vec))))
> + return;
> +
> + old_lpriv = per_cpu_ptr(imsic->lpriv, old_vec->cpu);
> + if (WARN_ON(&old_lpriv->vectors[old_vec->local_id] != old_vec))
> + return;
> +
> + new_lpriv = per_cpu_ptr(imsic->lpriv, new_vec->cpu);
> + if (WARN_ON(&new_lpriv->vectors[new_vec->local_id] != new_vec))
> + return;
> +
> + raw_spin_lock_irqsave(&old_lpriv->ids_lock, flags);
> + raw_spin_lock_irqsave(&new_lpriv->ids_lock, flags1);
> +
> + /* Move the state of each vector entry */
> + for (i = 0; i < BIT(old_vec->order); i++) {
> + ovec = old_vec + i;
> + nvec = new_vec + i;
> +
> + /* Unmask the new vector entry */
> + if (test_bit(ovec->local_id, old_lpriv->ids_enabled_bitmap))
> + bitmap_set(new_lpriv->ids_enabled_bitmap,
> + nvec->local_id, 1);
> +
> + /* Mask the old vector entry */
> + bitmap_clear(old_lpriv->ids_enabled_bitmap, ovec->local_id, 1);
> +
> + /*
> + * Move and re-trigger the new vector entry based on the
> + * pending state of the old vector entry because we might
> + * get a device interrupt on the old vector entry while
> + * device was being moved to the new vector entry.
> + */
> + old_lpriv->ids_move[ovec->local_id] = nvec;
> + }

Hmm, nested spinlocks, and reimplementing what the irq matrix allocator
does.

Convince me why the irq matrix is not a good fit to track the interrupt IDs
*and* get handling/tracking for managed/unmanaged interrupts. You said
that it was the power-of-two blocks for MSI, but can't that be enforced
on matrix alloc? Where are you doing the special handling of MSI?

The reason I'm asking is that I'm pretty certain x86 has proper
MSI support (Thomas Gleixner can answer for sure! ;-))

IMSIC smells a lot like the LAPIC. The implementation could probably
be *very* close to what arch/x86/kernel/apic/vector.c does.

Am I completely off here?


Björn

2023-10-25 05:08:42

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

On Tue, Oct 24, 2023 at 6:35 PM Björn Töpel <[email protected]> wrote:
>
> Hi Anup!
>
> Wow, I'm really happy to see that you're moving towards the 1-1 model!
>
> Anup Patel <[email protected]> writes:
>
> > The RISC-V advanced interrupt architecture (AIA) specification
> > defines a new MSI controller called incoming message signalled
> > interrupt controller (IMSIC) which manages MSI on per-HART (or
> > per-CPU) basis. It also supports IPIs as software injected MSIs.
> > (For more details refer https://github.com/riscv/riscv-aia)
> >
> > Let us add an early irqchip driver for RISC-V IMSIC which sets
> > up the IMSIC state and provide IPIs.
>
> It would help (reviewers, and future bugfixers) if you add (here or in
> the cover) what design decisions you've taken instead of just saying
> that you're now supporting IMSIC.

I agree with the suggestion, but this kind of information should be
in the source itself rather than in the commit description.

>
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > drivers/irqchip/Kconfig | 6 +
> > drivers/irqchip/Makefile | 1 +
> > drivers/irqchip/irq-riscv-imsic-early.c | 235 ++++++
> > drivers/irqchip/irq-riscv-imsic-state.c | 962 ++++++++++++++++++++++++
> > drivers/irqchip/irq-riscv-imsic-state.h | 109 +++
> > include/linux/irqchip/riscv-imsic.h | 87 +++
> > 6 files changed, 1400 insertions(+)
> > create mode 100644 drivers/irqchip/irq-riscv-imsic-early.c
> > create mode 100644 drivers/irqchip/irq-riscv-imsic-state.c
> > create mode 100644 drivers/irqchip/irq-riscv-imsic-state.h
> > create mode 100644 include/linux/irqchip/riscv-imsic.h
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index f7149d0f3d45..bdd80716114d 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -546,6 +546,12 @@ config SIFIVE_PLIC
> > select IRQ_DOMAIN_HIERARCHY
> > select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> >
> > +config RISCV_IMSIC
> > + bool
> > + depends on RISCV
> > + select IRQ_DOMAIN_HIERARCHY
> > + select GENERIC_MSI_IRQ
> > +
> > config EXYNOS_IRQ_COMBINER
> > bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> > depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index ffd945fe71aa..d714724387ce 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> > obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> > obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> > obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> > +obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o
> > obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> > obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> > obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
> > diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
> > new file mode 100644
> > index 000000000000..23f689ff5807
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-imsic-early.c
> > @@ -0,0 +1,235 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqchip/chained_irq.h>
> > +#include <linux/module.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/smp.h>
> > +
> > +#include "irq-riscv-imsic-state.h"
> > +
> > +static int imsic_parent_irq;
> > +
> > +#ifdef CONFIG_SMP
> > +static irqreturn_t imsic_local_sync_handler(int irq, void *data)
> > +{
> > + imsic_local_sync();
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static void imsic_ipi_send(unsigned int cpu)
> > +{
> > + struct imsic_local_config *local =
> > + per_cpu_ptr(imsic->global.local, cpu);
>
> General nit for the series: there are a lot of line breaks. Try
> to use the full 100 chars for a line.

I prefer the 80-character line limit.

>
> > +
> > + writel(IMSIC_IPI_ID, local->msi_va);
>
> Do you need the barriers here? If so, please document. If not, use the
> _relaxed() version.

We can't assume that the _relaxed() variants of the MMIO operations
will work on every RISC-V implementation, so we conservatively
use the regular MMIO operations instead of the _relaxed() ones.
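
To illustrate the distinction being discussed (a sketch, not part of
the patch; the ordering described is the generic Linux MMIO accessor
contract):

	/* Orders prior memory accesses before the MMIO store. */
	writel(IMSIC_IPI_ID, local->msi_va);

	/* No ordering guarantee with respect to prior accesses. */
	writel_relaxed(IMSIC_IPI_ID, local->msi_va);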

>
> > +}
> > +
> > +static void imsic_ipi_starting_cpu(void)
> > +{
> > + /* Enable IPIs for current CPU. */
> > + __imsic_id_set_enable(IMSIC_IPI_ID);
> > +
> > + /* Enable virtual IPI used for IMSIC ID synchronization */
> > + enable_percpu_irq(imsic->ipi_virq, 0);
>
> Maybe pass IRQ_TYPE_NONE instead of 0, so it's clearer?

Okay, I will update.
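
That is, presumably (IRQ_TYPE_NONE is 0, so the behaviour is unchanged):

	/* Enable virtual IPI used for IMSIC ID synchronization */
	enable_percpu_irq(imsic->ipi_virq, IRQ_TYPE_NONE);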

>
> > +}
> > +
> > +static void imsic_ipi_dying_cpu(void)
> > +{
> > + /*
> > + * Disable virtual IPI used for IMSIC ID synchronization so
> > + * that we don't receive ID synchronization requests.
> > + */
> > + disable_percpu_irq(imsic->ipi_virq);
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(void)
> > +{
> > + int virq;
> > +
> > + /* Create IMSIC IPI multiplexing */
> > + virq = ipi_mux_create(IMSIC_NR_IPI, imsic_ipi_send);
> > + if (virq <= 0)
> > + return (virq < 0) ? virq : -ENOMEM;
> > + imsic->ipi_virq = virq;
> > +
> > + /* First vIRQ is used for IMSIC ID synchronization */
> > + virq = request_percpu_irq(imsic->ipi_virq, imsic_local_sync_handler,
> > + "riscv-imsic-lsync", imsic->global.local);
> > + if (virq)
> > + return virq;
> > + irq_set_status_flags(imsic->ipi_virq, IRQ_HIDDEN);
> > + imsic->ipi_lsync_desc = irq_to_desc(imsic->ipi_virq);
> > +
> > + /* Set vIRQ range */
> > + riscv_ipi_set_virq_range(imsic->ipi_virq + 1, IMSIC_NR_IPI - 1, true);
> > +
> > + /* Announce that IMSIC is providing IPIs */
> > + pr_info("%pfwP: providing IPIs using interrupt %d\n",
> > + imsic->fwnode, IMSIC_IPI_ID);
> > +
> > + return 0;
> > +}
> > +#else
> > +static void imsic_ipi_starting_cpu(void)
> > +{
> > +}
> > +
> > +static void imsic_ipi_dying_cpu(void)
> > +{
> > +}
> > +
> > +static int __init imsic_ipi_domain_init(void)
> > +{
> > + return 0;
> > +}
> > +#endif
> > +
> > +/*
> > + * To handle an interrupt, we read the TOPEI CSR and write zero in one
> > + * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
> > + * Linux interrupt number and let Linux IRQ subsystem handle it.
> > + */
> > +static void imsic_handle_irq(struct irq_desc *desc)
> > +{
> > + struct irq_chip *chip = irq_desc_get_chip(desc);
> > + int err, cpu = smp_processor_id();
> > + struct imsic_vector *vec;
> > + unsigned long local_id;
> > +
> > + chained_irq_enter(chip, desc);
> > +
> > + while ((local_id = csr_swap(CSR_TOPEI, 0))) {
> > + local_id = local_id >> TOPEI_ID_SHIFT;
> > +
> > + if (local_id == IMSIC_IPI_ID) {
> > +#ifdef CONFIG_SMP
> > + ipi_mux_process();
> > +#endif
> > + continue;
> > + }
> > +
> > + if (unlikely(!imsic->base_domain))
> > + continue;
> > +
> > + vec = imsic_vector_from_local_id(cpu, local_id);
> > + if (!vec) {
> > + pr_warn_ratelimited(
> > + "vector not found for local ID 0x%lx\n",
> > + local_id);
> > + continue;
> > + }
> > +
> > + err = generic_handle_domain_irq(imsic->base_domain,
> > + vec->hwirq);
> > + if (unlikely(err))
> > + pr_warn_ratelimited(
> > + "hwirq 0x%x mapping not found\n",
> > + vec->hwirq);
> > + }
> > +
> > + chained_irq_exit(chip, desc);
> > +}
> > +
> > +static int imsic_starting_cpu(unsigned int cpu)
> > +{
> > + /* Enable per-CPU parent interrupt */
> > + enable_percpu_irq(imsic_parent_irq,
> > + irq_get_trigger_type(imsic_parent_irq));
> > +
> > + /* Setup IPIs */
> > + imsic_ipi_starting_cpu();
> > +
> > + /*
> > + * Interrupts identities might have been enabled/disabled while
> > + * this CPU was not running so sync-up local enable/disable state.
> > + */
> > + imsic_local_sync();
> > +
> > + /* Enable local interrupt delivery */
> > + imsic_local_delivery(true);
> > +
> > + return 0;
> > +}
> > +
> > +static int imsic_dying_cpu(unsigned int cpu)
> > +{
> > + /* Cleanup IPIs */
> > + imsic_ipi_dying_cpu();
> > +
> > + return 0;
> > +}
> > +
> > +static int __init imsic_early_probe(struct fwnode_handle *fwnode)
> > +{
> > + int rc;
> > + struct irq_domain *domain;
> > +
> > + /* Find parent domain and register chained handler */
> > + domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > + DOMAIN_BUS_ANY);
> > + if (!domain) {
> > + pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
> > + return -ENOENT;
> > + }
> > + imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > + if (!imsic_parent_irq) {
> > + pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
> > + return -ENOENT;
> > + }
> > + irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
> > +
> > + /* Initialize IPI domain */
> > + rc = imsic_ipi_domain_init();
> > + if (rc) {
> > + pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
> > + return rc;
> > + }
> > +
> > + /*
> > + * Setup cpuhp state (must be done after setting imsic_parent_irq)
> > + *
> > + * Don't disable per-CPU IMSIC file when CPU goes offline
> > + * because this affects IPI and the masking/unmasking of
> > + * virtual IPIs is done via generic IPI-Mux
> > + */
> > + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > + "irqchip/riscv/imsic:starting",
> > + imsic_starting_cpu, imsic_dying_cpu);
> > +
> > + return 0;
> > +}
> > +
> > +static int __init imsic_early_dt_init(struct device_node *node,
> > + struct device_node *parent)
> > +{
> > + int rc;
> > + struct fwnode_handle *fwnode = &node->fwnode;
> > +
> > + /* Setup IMSIC state */
> > + rc = imsic_setup_state(fwnode);
> > + if (rc) {
> > + pr_err("%pfwP: failed to setup state (error %d)\n",
> > + fwnode, rc);
> > + return rc;
> > + }
> > +
> > + /* Do early setup of IPIs */
> > + rc = imsic_early_probe(fwnode);
> > + if (rc)
> > + return rc;
> > +
> > + /* Ensure that OF platform device gets probed */
> > + of_node_clear_flag(node, OF_POPULATED);
> > + return 0;
> > +}
> > +IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_early_dt_init);
> > diff --git a/drivers/irqchip/irq-riscv-imsic-state.c b/drivers/irqchip/irq-riscv-imsic-state.c
> > new file mode 100644
> > index 000000000000..54465e47851c
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-imsic-state.c
> > @@ -0,0 +1,962 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-imsic: " fmt
> > +#include <linux/cpu.h>
> > +#include <linux/bitmap.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/irq.h>
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of_irq.h>
> > +#include <linux/seq_file.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/smp.h>
> > +#include <asm/hwcap.h>
> > +
> > +#include "irq-riscv-imsic-state.h"
> > +
> > +#define IMSIC_DISABLE_EIDELIVERY 0
> > +#define IMSIC_ENABLE_EIDELIVERY 1
> > +#define IMSIC_DISABLE_EITHRESHOLD 1
> > +#define IMSIC_ENABLE_EITHRESHOLD 0
> > +
> > +#define imsic_csr_write(__c, __v) \
> > +do { \
> > + csr_write(CSR_ISELECT, __c); \
> > + csr_write(CSR_IREG, __v); \
> > +} while (0)
> > +
> > +#define imsic_csr_read(__c) \
> > +({ \
> > + unsigned long __v; \
> > + csr_write(CSR_ISELECT, __c); \
> > + __v = csr_read(CSR_IREG); \
> > + __v; \
> > +})
> > +
> > +#define imsic_csr_read_clear(__c, __v) \
> > +({ \
> > + unsigned long __r; \
> > + csr_write(CSR_ISELECT, __c); \
> > + __r = csr_read_clear(CSR_IREG, __v); \
> > + __r; \
> > +})
> > +
> > +#define imsic_csr_set(__c, __v) \
> > +do { \
> > + csr_write(CSR_ISELECT, __c); \
> > + csr_set(CSR_IREG, __v); \
> > +} while (0)
> > +
> > +#define imsic_csr_clear(__c, __v) \
> > +do { \
> > + csr_write(CSR_ISELECT, __c); \
> > + csr_clear(CSR_IREG, __v); \
> > +} while (0)
> > +
> > +struct imsic_priv *imsic;
> > +
> > +const struct imsic_global_config *imsic_get_global_config(void)
> > +{
> > + return (imsic) ? &imsic->global : NULL;
>
> Nit: No need for the parenthesis.

Okay, I will update.

>
> > +}
> > +EXPORT_SYMBOL_GPL(imsic_get_global_config);
> > +
> > +static bool __imsic_eix_read_clear(unsigned long id, bool pend)
> > +{
> > + unsigned long isel, imask;
> > +
> > + isel = id / BITS_PER_LONG;
> > + isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> > + isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
> > + imask = BIT(id & (__riscv_xlen - 1));
> > +
> > + return (imsic_csr_read_clear(isel, imask) & imask) ? true : false;
> > +}
> > +
> > +#define __imsic_id_read_clear_enabled(__id) \
> > + __imsic_eix_read_clear((__id), false)
> > +#define __imsic_id_read_clear_pending(__id) \
> > + __imsic_eix_read_clear((__id), true)
> > +
> > +void __imsic_eix_update(unsigned long base_id,
> > + unsigned long num_id, bool pend, bool val)
> > +{
> > + unsigned long i, isel, ireg;
> > + unsigned long id = base_id, last_id = base_id + num_id;
> > +
> > + while (id < last_id) {
> > + isel = id / BITS_PER_LONG;
> > + isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
> > + isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
>
> Parenthesis nit.

Okay, I will update.

>
> > +
> > + ireg = 0;
> > + for (i = id & (__riscv_xlen - 1);
> > + (id < last_id) && (i < __riscv_xlen); i++) {
> > + ireg |= BIT(i);
> > + id++;
> > + }
> > +
> > + /*
> > + * The IMSIC EIEx and EIPx registers are indirectly
> > + * accessed via using ISELECT and IREG CSRs so we
> > + * need to access these CSRs without getting preempted.
> > + *
> > + * All existing users of this function call this
> > + * function with local IRQs disabled so we don't
> > + * need to do anything special here.
> > + */
> > + if (val)
> > + imsic_csr_set(isel, ireg);
> > + else
> > + imsic_csr_clear(isel, ireg);
> > + }
> > +}
> > +
> > +void imsic_local_sync(void)
> > +{
> > + struct imsic_local_priv *lpriv = this_cpu_ptr(imsic->lpriv);
> > + struct imsic_local_config *mlocal;
> > + struct imsic_vector *mvec;
> > + unsigned long flags;
> > + int i;
> > +
> > + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> > + for (i = 1; i <= imsic->global.nr_ids; i++) {
> > + if (i == IMSIC_IPI_ID)
> > + continue;
> > +
> > + if (test_bit(i, lpriv->ids_enabled_bitmap))
> > + __imsic_id_set_enable(i);
> > + else
> > + __imsic_id_clear_enable(i);
> > +
> > + mvec = lpriv->ids_move[i];
> > + lpriv->ids_move[i] = NULL;
> > + if (mvec) {
> > + if (__imsic_id_read_clear_pending(i)) {
> > + mlocal = per_cpu_ptr(imsic->global.local,
> > + mvec->cpu);
> > + writel(mvec->local_id, mlocal->msi_va);
>
> Again, do you need all the barriers? If yes, document. No, then relax
> the call.

Same comment as above.

>
> > + }
> > +
> > + lpriv->vectors[i].hwirq = UINT_MAX;
> > + lpriv->vectors[i].order = UINT_MAX;
> > + clear_bit(i, lpriv->ids_used_bitmap);
> > + }
> > +
> > + }
> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> > +}
> > +
> > +void imsic_local_delivery(bool enable)
> > +{
> > + if (enable) {
> > + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> > + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> > + } else {
> > + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> > + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> > + }
>
> My regular "early exit" nit. I guess I really dislike indentation. ;-)

-ENOPARSE

>
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static void imsic_remote_sync(unsigned int cpu)
> > +{
> > + /*
> > + * We simply inject ID synchronization IPI to a target CPU
> > + * if it is not same as the current CPU. The ipi_send_mask()
> > + * implementation of IPI mux will inject ID synchronization
> > + * IPI only for CPUs that have enabled it so offline CPUs
> > + * won't receive IPI. An offline CPU will unconditionally
> > + * synchronize IDs through imsic_starting_cpu() when the
> > + * CPU is brought up.
> > + */
> > + if (cpu_online(cpu)) {
> > + if (cpu != smp_processor_id())
> > + __ipi_send_mask(imsic->ipi_lsync_desc, cpumask_of(cpu));
> > + else
> > + imsic_local_sync();
> > + }
> > +}
> > +#else
> > +static inline void imsic_remote_sync(unsigned int cpu)
>
> Remove inline.

Okay, I will update.

>
> > +{
> > + imsic_local_sync();
> > +}
> > +#endif
> > +
> > +void imsic_vector_mask(struct imsic_vector *vec)
> > +{
> > + struct imsic_local_priv *lpriv;
> > + unsigned long flags;
> > +
> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> > + return;
> > +
> > + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> > + bitmap_clear(lpriv->ids_enabled_bitmap, vec->local_id, 1);
> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> > +
> > + imsic_remote_sync(vec->cpu);
>
> x86 seems to set a timer instead, for the remote cpu cleanup, which can
> be much cheaper, and less in instrusive. Is that applicable here?

The issue with that approach is deciding the right duration
for the timer. There might be platforms that need an
immediate mask/unmask response. We can certainly keep
improving/tuning this over time.

>
> > +}
> > +
> > +void imsic_vector_unmask(struct imsic_vector *vec)
> > +{
> > + struct imsic_local_priv *lpriv;
> > + unsigned long flags;
> > +
> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> > + return;
> > +
> > + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> > + bitmap_set(lpriv->ids_enabled_bitmap, vec->local_id, 1);
> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> > +
> > + imsic_remote_sync(vec->cpu);
> > +}
> > +
> > +void imsic_vector_move(struct imsic_vector *old_vec,
> > + struct imsic_vector *new_vec)
> > +{
> > + struct imsic_local_priv *old_lpriv, *new_lpriv;
> > + struct imsic_vector *ovec, *nvec;
> > + unsigned long flags, flags1;
> > + unsigned int i;
> > +
> > + if (WARN_ON(old_vec->cpu == new_vec->cpu ||
> > + old_vec->order != new_vec->order ||
> > + (old_vec->local_id & IMSIC_VECTOR_MASK(old_vec)) ||
> > + (new_vec->local_id & IMSIC_VECTOR_MASK(new_vec))))
> > + return;
> > +
> > + old_lpriv = per_cpu_ptr(imsic->lpriv, old_vec->cpu);
> > + if (WARN_ON(&old_lpriv->vectors[old_vec->local_id] != old_vec))
> > + return;
> > +
> > + new_lpriv = per_cpu_ptr(imsic->lpriv, new_vec->cpu);
> > + if (WARN_ON(&new_lpriv->vectors[new_vec->local_id] != new_vec))
> > + return;
> > +
> > + raw_spin_lock_irqsave(&old_lpriv->ids_lock, flags);
> > + raw_spin_lock_irqsave(&new_lpriv->ids_lock, flags1);
> > +
> > + /* Move the state of each vector entry */
> > + for (i = 0; i < BIT(old_vec->order); i++) {
> > + ovec = old_vec + i;
> > + nvec = new_vec + i;
> > +
> > + /* Unmask the new vector entry */
> > + if (test_bit(ovec->local_id, old_lpriv->ids_enabled_bitmap))
> > + bitmap_set(new_lpriv->ids_enabled_bitmap,
> > + nvec->local_id, 1);
> > +
> > + /* Mask the old vector entry */
> > + bitmap_clear(old_lpriv->ids_enabled_bitmap, ovec->local_id, 1);
> > +
> > + /*
> > + * Move and re-trigger the new vector entry based on the
> > + * pending state of the old vector entry because we might
> > + * get a device interrupt on the old vector entry while
> > + * device was being moved to the new vector entry.
> > + */
> > + old_lpriv->ids_move[ovec->local_id] = nvec;
> > + }
>
> Hmm, nested spinlocks, and reimplementing what the irq matrix allocator
> does.
>
> Convince me why irq matrix is not a good fit to track the interrupts IDs
> *and* get handling/tracking for managed/unmanaged interrupts. You said
> that it was the power-of-two blocks for MSI, but can't that be enfored
> on matrix alloc? Where are you doing the special handling of MSI?
>
> The reason I'm asking is because I'm pretty certain that x86 has proper
> MSI support (Thomas Gleixner can answer for sure! ;-))
>
> IMSIC smells a lot like the the LAPIC. The implementation could probably
> be *very* close to what arch/x86/kernel/apic/vector.c does.
>
> Am I completly off here?
>

The x86 APIC driver only supports MSI-X, so the IRQ matrix
allocator only supports ID/vector allocation suitable for MSI-X,
whereas the ARM GICv3 driver supports both legacy MSI and MSI-X.
In the absence of legacy MSI support, Linux on x86 falls back to
INTx for PCI devices that only implement legacy MSI, but on RISC-V
platforms we can't assume that INTx is available because we might
be dealing with an IMSIC-only platform.

Refer, x86_vector_msi_parent_ops in arch/x86/kernel/apic/msi.c and
X86_VECTOR_MSI_FLAGS_SUPPORTED in arch/x86/include/asm/msi.h

Refer, its_pci_msi_domain_info in drivers/irqchip/irq-gic-v3-its-pci-msi.c

The changes which I think are needed in the IRQ matrix allocator before
integrating it in the IMSIC driver are the following:
1) The IRQ matrix allocator assumes NR_VECTORS to be a fixed define
provided by the arch code, but in the RISC-V world the number of
IDs is discovered from DT or ACPI.
2) The IRQ matrix allocator needs to support allocating multiple vectors
in power-of-2 blocks, which will allow the IMSIC driver to support both
legacy MSI and MSI-X. This will involve changing the way the best CPU
is found, the way the bitmap APIs are used, and adding some new APIs
for allocating vectors in powers of 2 (rough sketch below).
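
For example, something along these lines (purely hypothetical
signature; this API does not exist in kernel/irq/matrix.c today):

/*
 * Hypothetical extension of the IRQ matrix allocator: allocate
 * 2^order consecutive, naturally aligned IDs from the per-CPU
 * maps so that a (non-X) multi-MSI block can be handed out.
 * Returns the base ID or a negative error code.
 */
int irq_matrix_alloc_order(struct irq_matrix *m, unsigned int order,
			   const struct cpumask *msk, bool reserved,
			   unsigned int *mapped_cpu);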

Based on the above, I suggest we keep the integration of the IRQ matrix
allocator in the IMSIC driver as a separate series, which will allow
us to unblock other series (such as AIA ACPI support, power
management related changes in the AIA drivers, etc).

Regards,
Anup

2023-10-25 05:08:56

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v11 09/14] irqchip/riscv-imsic: Add support for PCI MSI irqdomain

On Tue, Oct 24, 2023 at 6:39 PM Björn Töpel <[email protected]> wrote:
>
> Anup Patel <[email protected]> writes:
>
> > The Linux PCI framework requires it's own dedicated MSI irqdomain so
> > let us create PCI MSI irqdomain as child of the IMSIC base irqdomain.
> >
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > drivers/irqchip/Kconfig | 7 +++
> > drivers/irqchip/irq-riscv-imsic-platform.c | 51 ++++++++++++++++++++++
> > drivers/irqchip/irq-riscv-imsic-state.h | 1 +
> > 3 files changed, 59 insertions(+)
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index bdd80716114d..c1d69b418dfb 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -552,6 +552,13 @@ config RISCV_IMSIC
> > select IRQ_DOMAIN_HIERARCHY
> > select GENERIC_MSI_IRQ
> >
> > +config RISCV_IMSIC_PCI
> > + bool
> > + depends on RISCV_IMSIC
> > + depends on PCI
> > + depends on PCI_MSI
> > + default RISCV_IMSIC
> > +
> > config EXYNOS_IRQ_COMBINER
> > bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
> > depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
> > diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c
> > index 23d286cb017e..cdb659401199 100644
> > --- a/drivers/irqchip/irq-riscv-imsic-platform.c
> > +++ b/drivers/irqchip/irq-riscv-imsic-platform.c
> > @@ -13,6 +13,7 @@
> > #include <linux/irqdomain.h>
> > #include <linux/module.h>
> > #include <linux/msi.h>
> > +#include <linux/pci.h>
> > #include <linux/platform_device.h>
> > #include <linux/spinlock.h>
> > #include <linux/smp.h>
> > @@ -215,6 +216,42 @@ static const struct irq_domain_ops imsic_base_domain_ops = {
> > #endif
> > };
> >
> > +#ifdef CONFIG_RISCV_IMSIC_PCI
> > +
> > +static void imsic_pci_mask_irq(struct irq_data *d)
> > +{
> > + pci_msi_mask_irq(d);
> > + irq_chip_mask_parent(d);
>
> I've asked this before, but I still don't get why you need to propagate
> to the parent? Why isn't masking on PCI enough?
>

We are using hierarchical IRQ domains where IMSIC-BASE is
the root domain and the IMSIC-PLAT domain (MSI irq domain
for platform devices) and IMSIC-PCI domain (MSI irq domain
for PCI devices) are its children. For hierarchical IRQ domains,
if irq domain X does not implement irq_mask/unmask then the
parent irq domain's irq_mask/unmask is called with the parent
irq descriptor.

Now, for the IMSIC-PCI domain, the PCI framework expects the
pci_msi_mask/unmask_irq() functions to be called, but if we
point the IMSIC-PCI irqchip directly at pci_msi_mask/unmask_irq()
then the IMSIC-BASE (parent domain) irq_mask/unmask won't be
called, hence the IRQ won't actually be masked/unmasked.
Due to this, we call both pci_msi_mask/unmask_irq() and
irq_chip_mask/unmask_parent() for the IMSIC-PCI domain.

The ARM GIC driver also uses hierarchical IRQ domains and
does the same thing there.
(Refer to the first 30 lines of drivers/irqchip/irq-gic-v3-its-pci-msi.c)
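
In other words, the IMSIC-PCI irqchip callbacks look roughly like the
sketch below (only a sketch, mirroring the GICv3 ITS PCI/MSI pattern;
the exact ordering in the actual patch may differ slightly):

static void imsic_pci_mask_irq(struct irq_data *d)
{
	/* Mask at the PCI/MSI level ... */
	pci_msi_mask_irq(d);
	/* ... and also in the parent (IMSIC-BASE) domain */
	irq_chip_mask_parent(d);
}

static void imsic_pci_unmask_irq(struct irq_data *d)
{
	/* Unmask at the PCI/MSI level ... */
	pci_msi_unmask_irq(d);
	/* ... and also in the parent (IMSIC-BASE) domain */
	irq_chip_unmask_parent(d);
}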

Regards,
Anup

2023-10-25 08:56:27

by Björn Töpel

[permalink] [raw]
Subject: Re: [PATCH v11 09/14] irqchip/riscv-imsic: Add support for PCI MSI irqdomain

Anup Patel <[email protected]> writes:

> On Tue, Oct 24, 2023 at 6:39 PM Björn Töpel <[email protected]> wrote:
>>
>> Anup Patel <[email protected]> writes:
>>
>> > The Linux PCI framework requires it's own dedicated MSI irqdomain so
>> > let us create PCI MSI irqdomain as child of the IMSIC base irqdomain.
>> >
>> > Signed-off-by: Anup Patel <[email protected]>
>> > ---
>> > drivers/irqchip/Kconfig | 7 +++
>> > drivers/irqchip/irq-riscv-imsic-platform.c | 51 ++++++++++++++++++++++
>> > drivers/irqchip/irq-riscv-imsic-state.h | 1 +
>> > 3 files changed, 59 insertions(+)
>> >
>> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
>> > index bdd80716114d..c1d69b418dfb 100644
>> > --- a/drivers/irqchip/Kconfig
>> > +++ b/drivers/irqchip/Kconfig
>> > @@ -552,6 +552,13 @@ config RISCV_IMSIC
>> > select IRQ_DOMAIN_HIERARCHY
>> > select GENERIC_MSI_IRQ
>> >
>> > +config RISCV_IMSIC_PCI
>> > + bool
>> > + depends on RISCV_IMSIC
>> > + depends on PCI
>> > + depends on PCI_MSI
>> > + default RISCV_IMSIC
>> > +
>> > config EXYNOS_IRQ_COMBINER
>> > bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
>> > depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
>> > diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c
>> > index 23d286cb017e..cdb659401199 100644
>> > --- a/drivers/irqchip/irq-riscv-imsic-platform.c
>> > +++ b/drivers/irqchip/irq-riscv-imsic-platform.c
>> > @@ -13,6 +13,7 @@
>> > #include <linux/irqdomain.h>
>> > #include <linux/module.h>
>> > #include <linux/msi.h>
>> > +#include <linux/pci.h>
>> > #include <linux/platform_device.h>
>> > #include <linux/spinlock.h>
>> > #include <linux/smp.h>
>> > @@ -215,6 +216,42 @@ static const struct irq_domain_ops imsic_base_domain_ops = {
>> > #endif
>> > };
>> >
>> > +#ifdef CONFIG_RISCV_IMSIC_PCI
>> > +
>> > +static void imsic_pci_mask_irq(struct irq_data *d)
>> > +{
>> > + pci_msi_mask_irq(d);
>> > + irq_chip_mask_parent(d);
>>
>> I've asked this before, but I still don't get why you need to propagate
>> to the parent? Why isn't masking on PCI enough?
>>
>
> We are using hierarchical IRQ domains where IMSIC-BASE is
> the root domain whereas IMSIC-PLAT domain (MSI irq domain
> for platform devices) and IMSIC-PCI domain (MSI irq domain
> for PCI devices). For hierarchical IRQ domains, if irq domain X
> does not implement irq_mask/unmask then the parent irq
> domain irq_mask/unmask is called with parent irq descriptor.
>
> Now for IMSIC-PCI domain, the PCI framework expects the
> pci_msi_mask/unmask_irq() functions to be called but if
> we directly point pci_msi_mask/unmask_irq() in the IMSIC-PCI
> irqchip then IMSIC-BASE (parent domain) irq_mask/umask
> won't be called hence the IRQ won't be masked/unmask.
> Due to this, we call both pci_msi_mask/unmask_irq() and
> irq_chip_mask/unmask_parent() for IMSIC-PCI domain.

Ok. I won't dig more into it for now! If the interrupt is disabled at
PCI, it seems a bit overkill to *also* mask it at the IMSIC level...


Björn

2023-10-25 16:07:25

by Björn Töpel

[permalink] [raw]
Subject: Re: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

Hi!

Anup Patel <[email protected]> writes:

> On Tue, Oct 24, 2023 at 6:35 PM Björn Töpel <[email protected]> wrote:
>>
>> Hi Anup!
>>
>> Wow, I'm really happy to see that you're moving towards the 1-1 model!
>>
>> Anup Patel <[email protected]> writes:
>>
>> > The RISC-V advanced interrupt architecture (AIA) specification
>> > defines a new MSI controller called incoming message signalled
>> > interrupt controller (IMSIC) which manages MSI on per-HART (or
>> > per-CPU) basis. It also supports IPIs as software injected MSIs.
>> > (For more details refer https://github.com/riscv/riscv-aia)
>> >
>> > Let us add an early irqchip driver for RISC-V IMSIC which sets
>> > up the IMSIC state and provide IPIs.
>>
>> It would help (reviewers, and future bugfixers) if you add (here or in
>> the cover) what design decisions you've taken instead of just saying
>> that you're now supporting IMSIC.
>
> I agree with the suggestion but this kind of information should be
> in the source itself rather than commit description.

I think the high-level flow, and why you made certain design decisions
should be in the commit message.

The "how" in the code, the "why" in the commit message. Regardless -- it
would make it easier for reviewers to get into your code faster.

[...]

>> > +
>> > + writel(IMSIC_IPI_ID, local->msi_va);
>>
>> Do you need the barriers here? If so, please document. If not, use the
>> _releaxed() version.
>
> We can't assume that _relaxed version of MMIO operations
> will work for RISC-V implementation so we conservatively
> use regular MMIO operations without _releaxed().

You'll need to expand on your thinking here, Anup. We can't just
sprinkle fences everywhere because of "we can't assume it'll work". Do
you need proper barriers for IPIs or not?

[...]

>> > + mvec = lpriv->ids_move[i];
>> > + lpriv->ids_move[i] = NULL;
>> > + if (mvec) {
>> > + if (__imsic_id_read_clear_pending(i)) {
>> > + mlocal = per_cpu_ptr(imsic->global.local,
>> > + mvec->cpu);
>> > + writel(mvec->local_id, mlocal->msi_va);
>>
>> Again, do you need all the barriers? If yes, document. No, then relax
>> the call.
>
> Same comment as above.

Ditto for me! ;-)

>> > + }
>> > +
>> > + lpriv->vectors[i].hwirq = UINT_MAX;
>> > + lpriv->vectors[i].order = UINT_MAX;
>> > + clear_bit(i, lpriv->ids_used_bitmap);
>> > + }
>> > +
>> > + }
>> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
>> > +}
>> > +
>> > +void imsic_local_delivery(bool enable)
>> > +{
>> > + if (enable) {
>> > + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
>> > + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
>> > + } else {
>> > + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
>> > + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
>> > + }
>>
>> My regular "early exit" nit. I guess I really dislike indentation. ;-)
>
> -ENOPARSE

if (...) {
        a();
        b();
        c();
} else {
        d();
        e();
}

vs

if (...) {
        a();
        b();
        c();
        return;
}

d();
e();

[...]

>> > +void imsic_vector_mask(struct imsic_vector *vec)
>> > +{
>> > + struct imsic_local_priv *lpriv;
>> > + unsigned long flags;
>> > +
>> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
>> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
>> > + return;
>> > +
>> > + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
>> > + bitmap_clear(lpriv->ids_enabled_bitmap, vec->local_id, 1);
>> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
>> > +
>> > + imsic_remote_sync(vec->cpu);
>>
>> x86 seems to set a timer instead, for the remote cpu cleanup, which can
>> be much cheaper, and less in instrusive. Is that applicable here?
>
> The issue with that approach is deciding the right duration
> of timer interrupt. There might be platforms who need
> immediate mask/unmask response. We can certainely
> keep improving/tuning this over-time.

Any concrete examples where this is an actual problem?

[...]

>> > +void imsic_vector_move(struct imsic_vector *old_vec,
>> > + struct imsic_vector *new_vec)
>> > +{
>> > + struct imsic_local_priv *old_lpriv, *new_lpriv;
>> > + struct imsic_vector *ovec, *nvec;
>> > + unsigned long flags, flags1;
>> > + unsigned int i;
>> > +
>> > + if (WARN_ON(old_vec->cpu == new_vec->cpu ||
>> > + old_vec->order != new_vec->order ||
>> > + (old_vec->local_id & IMSIC_VECTOR_MASK(old_vec)) ||
>> > + (new_vec->local_id & IMSIC_VECTOR_MASK(new_vec))))
>> > + return;
>> > +
>> > + old_lpriv = per_cpu_ptr(imsic->lpriv, old_vec->cpu);
>> > + if (WARN_ON(&old_lpriv->vectors[old_vec->local_id] != old_vec))
>> > + return;
>> > +
>> > + new_lpriv = per_cpu_ptr(imsic->lpriv, new_vec->cpu);
>> > + if (WARN_ON(&new_lpriv->vectors[new_vec->local_id] != new_vec))
>> > + return;
>> > +
>> > + raw_spin_lock_irqsave(&old_lpriv->ids_lock, flags);
>> > + raw_spin_lock_irqsave(&new_lpriv->ids_lock, flags1);
>> > +
>> > + /* Move the state of each vector entry */
>> > + for (i = 0; i < BIT(old_vec->order); i++) {
>> > + ovec = old_vec + i;
>> > + nvec = new_vec + i;
>> > +
>> > + /* Unmask the new vector entry */
>> > + if (test_bit(ovec->local_id, old_lpriv->ids_enabled_bitmap))
>> > + bitmap_set(new_lpriv->ids_enabled_bitmap,
>> > + nvec->local_id, 1);
>> > +
>> > + /* Mask the old vector entry */
>> > + bitmap_clear(old_lpriv->ids_enabled_bitmap, ovec->local_id, 1);
>> > +
>> > + /*
>> > + * Move and re-trigger the new vector entry based on the
>> > + * pending state of the old vector entry because we might
>> > + * get a device interrupt on the old vector entry while
>> > + * device was being moved to the new vector entry.
>> > + */
>> > + old_lpriv->ids_move[ovec->local_id] = nvec;
>> > + }
>>
>> Hmm, nested spinlocks, and reimplementing what the irq matrix allocator
>> does.
>>
>> Convince me why irq matrix is not a good fit to track the interrupts IDs
>> *and* get handling/tracking for managed/unmanaged interrupts. You said
>> that it was the power-of-two blocks for MSI, but can't that be enfored
>> on matrix alloc? Where are you doing the special handling of MSI?
>>
>> The reason I'm asking is because I'm pretty certain that x86 has proper
>> MSI support (Thomas Gleixner can answer for sure! ;-))
>>
>> IMSIC smells a lot like the the LAPIC. The implementation could probably
>> be *very* close to what arch/x86/kernel/apic/vector.c does.
>>
>> Am I completly off here?
>>
>
> The x86 APIC driver only supports MSI-X due to which the IRQ matrix
> allocator only supports ID/Vector allocation suitable for MSI-X whereas
> the ARM GICv3 driver supports both legacy MSI and MSI-X. In absence
> of legacy MSI support, Linux x86 will fallback to INTx for PCI devices
> with legacy MSI support but for RISC-V platforms we can't assume that
> INTx is available because we might be dealing with an IMSIC-only
> platform.

You're mixing up MSI and *multi-MSI* (multiple MSI vectors).

x86 supports MSI-X, MSI, and multi-MSI with an IOMMU.

Gleixner has some insights on why one probably should *not* jump
through hoops to support multi-MSI:
https://lore.kernel.org/all/877d7yhve7.ffs@tglx/

Will we really see HW requiring multi-MSI support on RISC-V systems
without an IOMMU? To me this sounds like a theoretical exercise.

> Refer, x86_vector_msi_parent_ops in arch/x86/kernel/apic/msi.c and
> X86_VECTOR_MSI_FLAGS_SUPPORTED in arch/x86/include/asm/msi.h
>
> Refer, its_pci_msi_domain_info in drivers/irqchip/irq-gic-v3-its-pci-msi.c
>
> The changes which I think are need in the IRQ matrix allocator before
> integrating it in the IMSIC driver are the following:
> 1) IRQ matrix allocator assumed NR_VECTORS to be a fixed define
> which the arch code provides but in RISC-V world the number of
> IDs are discovered from DT or ACPI.

Ok, let's try to be a bit more explicit. Have you had a look at
kernel/irq/matrix.c?

You need to define the IRQ_MATRIX_BITS (which x86 sets to NR_VECTORS).
This is the size of the bitmap. For IMSIC this would be 2047.

The matrix allocator is an excellent fit, modulo multi-MSI. It's battle
proven code.

> 2) IRQ matrix allocator needs to be support allocator multiple vectors
> in power-of-2 which will allow IMSIC driver to support both legacy
> MSI and MSI-X. This will involve changing the way best CPU is
> found, the way bitmap APIs are used and adding some new APIs
> for allocate vectors in power-of-2

...and all the other things multi-MSI requires.

> Based on above, I suggest we keep the integration of IRQ matrix
> allocator in the IMSIC driver as a separate series which will allow
> us to unblock other series (such as AIA ACPI support, power
> managment related changes in AIA drivers, etc).

I suggest removing the multi-MSI support and using the matrix allocator.
We have something that looks like what x86 has (IMSIC). We have a
battle-proven implementation, and helper functions. In my view it would
be just weird not to piggy-back on that work, and benefit from years of
bugfixes/things we haven't thought of.

Finally, I don't see that you're handling managed interrupts in the
series (Oh, the matrix allocator has support for that! ;-)).

I realize it's some changes, but the interrupt handling is a central
piece.

If you agree with my input, LMK if you're time/work-constrained, and I
can take a stab at integrating it in the series.


Björn

2023-10-25 17:26:41

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

On Wed, Oct 25, 2023 at 9:35 PM Björn Töpel <[email protected]> wrote:
>
> Hi!
>
> Anup Patel <[email protected]> writes:
>
> > On Tue, Oct 24, 2023 at 6:35 PM Björn Töpel <[email protected]> wrote:
> >>
> >> Hi Anup!
> >>
> >> Wow, I'm really happy to see that you're moving towards the 1-1 model!
> >>
> >> Anup Patel <[email protected]> writes:
> >>
> >> > The RISC-V advanced interrupt architecture (AIA) specification
> >> > defines a new MSI controller called incoming message signalled
> >> > interrupt controller (IMSIC) which manages MSI on per-HART (or
> >> > per-CPU) basis. It also supports IPIs as software injected MSIs.
> >> > (For more details refer https://github.com/riscv/riscv-aia)
> >> >
> >> > Let us add an early irqchip driver for RISC-V IMSIC which sets
> >> > up the IMSIC state and provide IPIs.
> >>
> >> It would help (reviewers, and future bugfixers) if you add (here or in
> >> the cover) what design decisions you've taken instead of just saying
> >> that you're now supporting IMSIC.
> >
> > I agree with the suggestion but this kind of information should be
> > in the source itself rather than commit description.
>
> I think the high-level flow, and why you made certain design decisions
> should be in the commit message.
>
> The "how" in the code, the "why" in the commit message. Regardless -- it
> would make it easier for reviewers to get into your code faster.
>
> [...]
>
> >> > +
> >> > + writel(IMSIC_IPI_ID, local->msi_va);
> >>
> >> Do you need the barriers here? If so, please document. If not, use the
> >> _releaxed() version.
> >
> > We can't assume that _relaxed version of MMIO operations
> > will work for RISC-V implementation so we conservatively
> > use regular MMIO operations without _releaxed().
>
> You'll need to expand on your thinking here, Anup. We can't just
> sprinkle fences everywhere because of "we can't assume it'll work". Do
> you need proper barriers for IPIs or not?

For IPIs, we use the generic IPI-mux which has its own barriers. We
certainly need matching read and write barriers for the data being
passed for synchronization.

>
> [...]
>
> >> > + mvec = lpriv->ids_move[i];
> >> > + lpriv->ids_move[i] = NULL;
> >> > + if (mvec) {
> >> > + if (__imsic_id_read_clear_pending(i)) {
> >> > + mlocal = per_cpu_ptr(imsic->global.local,
> >> > + mvec->cpu);
> >> > + writel(mvec->local_id, mlocal->msi_va);
> >>
> >> Again, do you need all the barriers? If yes, document. No, then relax
> >> the call.
> >
> > Same comment as above.
>
> Dito for me! ;-)
>
> >> > + }
> >> > +
> >> > + lpriv->vectors[i].hwirq = UINT_MAX;
> >> > + lpriv->vectors[i].order = UINT_MAX;
> >> > + clear_bit(i, lpriv->ids_used_bitmap);
> >> > + }
> >> > +
> >> > + }
> >> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> >> > +}
> >> > +
> >> > +void imsic_local_delivery(bool enable)
> >> > +{
> >> > + if (enable) {
> >> > + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
> >> > + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
> >> > + } else {
> >> > + imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
> >> > + imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
> >> > + }
> >>
> >> My regular "early exit" nit. I guess I really dislike indentation. ;-)
> >
> > -ENOPARSE
>
> if (...) {
> a();
> b();
> c();
> } else {
> d();
> e();
> }
>
> vs
>
> if (...) {
> a();
> b();
> c();
> return;
> }
>
> d();
> e();
>
> [...]
>
> >> > +void imsic_vector_mask(struct imsic_vector *vec)
> >> > +{
> >> > + struct imsic_local_priv *lpriv;
> >> > + unsigned long flags;
> >> > +
> >> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> >> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> >> > + return;
> >> > +
> >> > + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> >> > + bitmap_clear(lpriv->ids_enabled_bitmap, vec->local_id, 1);
> >> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> >> > +
> >> > + imsic_remote_sync(vec->cpu);
> >>
> >> x86 seems to set a timer instead, for the remote cpu cleanup, which can
> >> be much cheaper, and less in instrusive. Is that applicable here?
> >
> > The issue with that approach is deciding the right duration
> > of timer interrupt. There might be platforms who need
> > immediate mask/unmask response. We can certainely
> > keep improving/tuning this over-time.
>
> Any concrete examples where this is an actual problem?

Do you have a concrete timer duration with proper justification?

>
> [...]
>
> >> > +void imsic_vector_move(struct imsic_vector *old_vec,
> >> > + struct imsic_vector *new_vec)
> >> > +{
> >> > + struct imsic_local_priv *old_lpriv, *new_lpriv;
> >> > + struct imsic_vector *ovec, *nvec;
> >> > + unsigned long flags, flags1;
> >> > + unsigned int i;
> >> > +
> >> > + if (WARN_ON(old_vec->cpu == new_vec->cpu ||
> >> > + old_vec->order != new_vec->order ||
> >> > + (old_vec->local_id & IMSIC_VECTOR_MASK(old_vec)) ||
> >> > + (new_vec->local_id & IMSIC_VECTOR_MASK(new_vec))))
> >> > + return;
> >> > +
> >> > + old_lpriv = per_cpu_ptr(imsic->lpriv, old_vec->cpu);
> >> > + if (WARN_ON(&old_lpriv->vectors[old_vec->local_id] != old_vec))
> >> > + return;
> >> > +
> >> > + new_lpriv = per_cpu_ptr(imsic->lpriv, new_vec->cpu);
> >> > + if (WARN_ON(&new_lpriv->vectors[new_vec->local_id] != new_vec))
> >> > + return;
> >> > +
> >> > + raw_spin_lock_irqsave(&old_lpriv->ids_lock, flags);
> >> > + raw_spin_lock_irqsave(&new_lpriv->ids_lock, flags1);
> >> > +
> >> > + /* Move the state of each vector entry */
> >> > + for (i = 0; i < BIT(old_vec->order); i++) {
> >> > + ovec = old_vec + i;
> >> > + nvec = new_vec + i;
> >> > +
> >> > + /* Unmask the new vector entry */
> >> > + if (test_bit(ovec->local_id, old_lpriv->ids_enabled_bitmap))
> >> > + bitmap_set(new_lpriv->ids_enabled_bitmap,
> >> > + nvec->local_id, 1);
> >> > +
> >> > + /* Mask the old vector entry */
> >> > + bitmap_clear(old_lpriv->ids_enabled_bitmap, ovec->local_id, 1);
> >> > +
> >> > + /*
> >> > + * Move and re-trigger the new vector entry based on the
> >> > + * pending state of the old vector entry because we might
> >> > + * get a device interrupt on the old vector entry while
> >> > + * device was being moved to the new vector entry.
> >> > + */
> >> > + old_lpriv->ids_move[ovec->local_id] = nvec;
> >> > + }
> >>
> >> Hmm, nested spinlocks, and reimplementing what the irq matrix allocator
> >> does.
> >>
> >> Convince me why irq matrix is not a good fit to track the interrupts IDs
> >> *and* get handling/tracking for managed/unmanaged interrupts. You said
> >> that it was the power-of-two blocks for MSI, but can't that be enfored
> >> on matrix alloc? Where are you doing the special handling of MSI?
> >>
> >> The reason I'm asking is because I'm pretty certain that x86 has proper
> >> MSI support (Thomas Gleixner can answer for sure! ;-))
> >>
> >> IMSIC smells a lot like the the LAPIC. The implementation could probably
> >> be *very* close to what arch/x86/kernel/apic/vector.c does.
> >>
> >> Am I completly off here?
> >>
> >
> > The x86 APIC driver only supports MSI-X due to which the IRQ matrix
> > allocator only supports ID/Vector allocation suitable for MSI-X whereas
> > the ARM GICv3 driver supports both legacy MSI and MSI-X. In absence
> > of legacy MSI support, Linux x86 will fallback to INTx for PCI devices
> > with legacy MSI support but for RISC-V platforms we can't assume that
> > INTx is available because we might be dealing with an IMSIC-only
> > platform.
>
> You're mixing up MSI and *multi-MSI* (multiple MSI vectors).

So now you are doubting my understanding of MSI?

>
> x86 support MSI-X, MSI, and multi-MSI with IOMMU.
>
> Gleixner has a some insights on why one probably should *not* jump
> through hoops to support multi-MSI:
> https://lore.kernel.org/all/877d7yhve7.ffs@tglx/

This is a fair justification for why x86 does not support
legacy MSI or "multi-MSI".

>
> Will we really see HW requiring multi-MSI support on RISC-V systems
> without IOMMU? To me this sounds like a theoretical exercise.
>
> > Refer, x86_vector_msi_parent_ops in arch/x86/kernel/apic/msi.c and
> > X86_VECTOR_MSI_FLAGS_SUPPORTED in arch/x86/include/asm/msi.h
> >
> > Refer, its_pci_msi_domain_info in drivers/irqchip/irq-gic-v3-its-pci-msi.c
> >
> > The changes which I think are need in the IRQ matrix allocator before
> > integrating it in the IMSIC driver are the following:
> > 1) IRQ matrix allocator assumed NR_VECTORS to be a fixed define
> > which the arch code provides but in RISC-V world the number of
> > IDs are discovered from DT or ACPI.
>
> Ok, let's try to be bit more explicit. Have you had a look at
> kernel/irq/matrix.c?

Why do you doubt it?

>
> You need to define the IRQ_MATRIX_BITS (which x86 sets to NR_VECTORS).
> This is the size of the bitmap. For IMSIC this would be 2047.

Wow, let's just create large bitmaps even when the underlying HW has
fewer per-CPU IDs!!!

>
> The matrix allocator is an excellent fit, modulo multi-MSI. It's battle
> proven code.
>
> > 2) IRQ matrix allocator needs to be support allocator multiple vectors
> > in power-of-2 which will allow IMSIC driver to support both legacy
> > MSI and MSI-X. This will involve changing the way best CPU is
> > found, the way bitmap APIs are used and adding some new APIs
> > for allocate vectors in power-of-2
>
> ...and all the other things multi-MSI requires.
>
> > Based on above, I suggest we keep the integration of IRQ matrix
> > allocator in the IMSIC driver as a separate series which will allow
> > us to unblock other series (such as AIA ACPI support, power
> > managment related changes in AIA drivers, etc).
>
> I suggest removing the multi-MSI support, and use the matrix allocator.
> We have something that looks like what x86 has (IMSIC). We have a
> battle-proven implementation, and helper function. In my view it would
> be just weird not to piggy-back on that work, and benefit from years of
> bugfixes/things we haven't thought of.
>
> Finally; I don't see that you're handling managed interrupt in the
> series (Oh, the matrix allocator has support for that! ;-)).

We don't need managed interrupts like x86 does. We are using the
IPI-mux to create multiple virtual IPIs on top of a single ID and we
use some of these virtual IPIs for internal management.

>
> I realize it's some changes, but the interrupt handling is a central
> piece.
>
> If you agree with my input, LMK if you're time/work-constrained, and I
> can take a stab at integrating it in the series.
>
>
> Björn

Regards,
Anup

2023-10-25 19:56:59

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v11 08/14] irqchip/riscv-imsic: Add support for platform MSI irqdomain

On Mon, Oct 23 2023 at 22:57, Anup Patel wrote:
> The Linux platform MSI support requires a platform MSI irqdomain so
> let us add a platform irqchip driver for RISC-V IMSIC which provides
> a base IRQ domain and platform MSI domain. This driver assumes that
> the IMSIC state is already initialized by the IMSIC early driver.

Please no. The platform MSI cruft is really a horrible concept and there
is ongoing work (sadly mightily delayed) to convert the main (ab)user
ARM over to per device MSI domains.

https://lore.kernel.org/r/[email protected]

2023-10-25 19:59:38

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v11 09/14] irqchip/riscv-imsic: Add support for PCI MSI irqdomain

On Mon, Oct 23 2023 at 22:57, Anup Patel wrote:
> The Linux PCI framework requires it's own dedicated MSI irqdomain so
> let us create PCI MSI irqdomain as child of the IMSIC base irqdomain.

Same here. Please don't add new incarnations of that and switch over to
per device MSI domains which is the most future proof mechanism.

Thanks,

tglx

2023-10-26 08:51:32

by Björn Töpel

[permalink] [raw]
Subject: Re: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

Hi Anup,

I'm getting the vibes that you are upset. Just to clarify: I want AIA
support as much as the next guy. I'm not here to pick fights or argue
about non-technical things. I'm just here for
questions/clarifications/suggestions, so we can move the implementation
forward.

If I for some reason offended you, please let me know. If that was the
case, that was not on purpose, and accept my apologies.

Now, please let's continue the technical discussion.

Anup Patel <[email protected]> writes:

>> >> > +
>> >> > + writel(IMSIC_IPI_ID, local->msi_va);
>> >>
>> >> Do you need the barriers here? If so, please document. If not, use the
>> >> _releaxed() version.
>> >
>> > We can't assume that _relaxed version of MMIO operations
>> > will work for RISC-V implementation so we conservatively
>> > use regular MMIO operations without _releaxed().
>>
>> You'll need to expand on your thinking here, Anup. We can't just
>> sprinkle fences everywhere because of "we can't assume it'll work". Do
>> you need proper barriers for IPIs or not?
>
> For IPIs, we use generic IPI-mux which has its own barriers. We
> certainly need matching read and write barrier for the data being
> passed for synchronization.

Ok! If the IPI-mux has the barriers, it seems like a writel_relaxed will
do just fine.
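
Concretely, something like the sketch below is what I have in mind
(names taken from your patch; whether the mux ordering is really
sufficient on the MMIO side is of course the open question):

static void imsic_ipi_send(unsigned int cpu)
{
	struct imsic_local_config *local =
			per_cpu_ptr(imsic->global.local, cpu);

	/*
	 * The generic IPI-mux publishes the pending-IPI bits with its
	 * own acquire/release ordering, so the MSI write itself would
	 * not need the extra fence that writel() implies.
	 */
	writel_relaxed(IMSIC_IPI_ID, local->msi_va);
}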

>> >> > +void imsic_vector_mask(struct imsic_vector *vec)
>> >> > +{
>> >> > + struct imsic_local_priv *lpriv;
>> >> > + unsigned long flags;
>> >> > +
>> >> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
>> >> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
>> >> > + return;
>> >> > +
>> >> > + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
>> >> > + bitmap_clear(lpriv->ids_enabled_bitmap, vec->local_id, 1);
>> >> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
>> >> > +
>> >> > + imsic_remote_sync(vec->cpu);
>> >>
>> >> x86 seems to set a timer instead, for the remote cpu cleanup, which can
>> >> be much cheaper, and less in instrusive. Is that applicable here?
>> >
>> > The issue with that approach is deciding the right duration
>> > of timer interrupt. There might be platforms who need
>> > immediate mask/unmask response. We can certainely
>> > keep improving/tuning this over-time.
>>
>> Any concrete examples where this is an actual problem?
>
> Do you have a concrete timer duration with proper justification ?

I would simply mimic what x86 does for now -- jiffies + 1.

No biggie for me, and this can, as you say, be improved later.
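
Roughly like this (a sketch only -- the cleanup_timer field and the
helper are made up here, but the pattern is what
arch/x86/kernel/apic/vector.c does):

static void imsic_local_cleanup_fn(struct timer_list *t)
{
	/* Runs on the old target CPU and re-syncs its local state */
	imsic_local_sync();
}

static void imsic_schedule_remote_cleanup(unsigned int old_cpu)
{
	struct imsic_local_priv *lpriv = per_cpu_ptr(imsic->lpriv, old_cpu);

	/*
	 * Instead of sending an IPI right away, arm a (hypothetical)
	 * per-CPU timer on the old target, x86-style. The timer would
	 * be set up once with timer_setup() during CPU bring-up.
	 */
	if (!timer_pending(&lpriv->cleanup_timer)) {
		lpriv->cleanup_timer.expires = jiffies + 1;
		add_timer_on(&lpriv->cleanup_timer, old_cpu);
	}
}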

>> >> > +void imsic_vector_move(struct imsic_vector *old_vec,
>> >> > + struct imsic_vector *new_vec)
>> >> > +{
>> >> > + struct imsic_local_priv *old_lpriv, *new_lpriv;
>> >> > + struct imsic_vector *ovec, *nvec;
>> >> > + unsigned long flags, flags1;
>> >> > + unsigned int i;
>> >> > +
>> >> > + if (WARN_ON(old_vec->cpu == new_vec->cpu ||
>> >> > + old_vec->order != new_vec->order ||
>> >> > + (old_vec->local_id & IMSIC_VECTOR_MASK(old_vec)) ||
>> >> > + (new_vec->local_id & IMSIC_VECTOR_MASK(new_vec))))
>> >> > + return;
>> >> > +
>> >> > + old_lpriv = per_cpu_ptr(imsic->lpriv, old_vec->cpu);
>> >> > + if (WARN_ON(&old_lpriv->vectors[old_vec->local_id] != old_vec))
>> >> > + return;
>> >> > +
>> >> > + new_lpriv = per_cpu_ptr(imsic->lpriv, new_vec->cpu);
>> >> > + if (WARN_ON(&new_lpriv->vectors[new_vec->local_id] != new_vec))
>> >> > + return;
>> >> > +
>> >> > + raw_spin_lock_irqsave(&old_lpriv->ids_lock, flags);
>> >> > + raw_spin_lock_irqsave(&new_lpriv->ids_lock, flags1);
>> >> > +
>> >> > + /* Move the state of each vector entry */
>> >> > + for (i = 0; i < BIT(old_vec->order); i++) {
>> >> > + ovec = old_vec + i;
>> >> > + nvec = new_vec + i;
>> >> > +
>> >> > + /* Unmask the new vector entry */
>> >> > + if (test_bit(ovec->local_id, old_lpriv->ids_enabled_bitmap))
>> >> > + bitmap_set(new_lpriv->ids_enabled_bitmap,
>> >> > + nvec->local_id, 1);
>> >> > +
>> >> > + /* Mask the old vector entry */
>> >> > + bitmap_clear(old_lpriv->ids_enabled_bitmap, ovec->local_id, 1);
>> >> > +
>> >> > + /*
>> >> > + * Move and re-trigger the new vector entry based on the
>> >> > + * pending state of the old vector entry because we might
>> >> > + * get a device interrupt on the old vector entry while
>> >> > + * device was being moved to the new vector entry.
>> >> > + */
>> >> > + old_lpriv->ids_move[ovec->local_id] = nvec;
>> >> > + }
>> >>
>> >> Hmm, nested spinlocks, and reimplementing what the irq matrix allocator
>> >> does.
>> >>
>> >> Convince me why irq matrix is not a good fit to track the interrupts IDs
>> >> *and* get handling/tracking for managed/unmanaged interrupts. You said
>> >> that it was the power-of-two blocks for MSI, but can't that be enfored
>> >> on matrix alloc? Where are you doing the special handling of MSI?
>> >>
>> >> The reason I'm asking is because I'm pretty certain that x86 has proper
>> >> MSI support (Thomas Gleixner can answer for sure! ;-))
>> >>
>> >> IMSIC smells a lot like the the LAPIC. The implementation could probably
>> >> be *very* close to what arch/x86/kernel/apic/vector.c does.
>> >>
>> >> Am I completly off here?
>> >>
>> >
>> > The x86 APIC driver only supports MSI-X due to which the IRQ matrix
>> > allocator only supports ID/Vector allocation suitable for MSI-X whereas
>> > the ARM GICv3 driver supports both legacy MSI and MSI-X. In absence
>> > of legacy MSI support, Linux x86 will fallback to INTx for PCI devices
>> > with legacy MSI support but for RISC-V platforms we can't assume that
>> > INTx is available because we might be dealing with an IMSIC-only
>> > platform.
>>
>> You're mixing up MSI and *multi-MSI* (multiple MSI vectors).
>
> So now you are doubting my understanding of MSI ?

I'm not doubting anything. Maybe we need to clarify so that we
understand each other.

You said: "The x86 APIC driver only supports MSI-X..." And that made me
think that you didn't have all the details. Sorry for making that
assumption.

Let's clear up the terminology, for our own sake:

* legacy-MSI: MSI (non-MSIX!), with *only one vector*.
* multi-MSI: MSI (non-MSIX!), with multiple vectors
* MSI-X

"MSI" can also refer to all of the above.

x86 supports legacy-MSI and MSI-X for non-remapped MSIs, and multi-MSI
with IOMMU support.

>> x86 support MSI-X, MSI, and multi-MSI with IOMMU.
>>
>> Gleixner has a some insights on why one probably should *not* jump
>> through hoops to support multi-MSI:
>> https://lore.kernel.org/all/877d7yhve7.ffs@tglx/
>
> This is a fair justification to drop why x86 does not support
> the legacy-MSI or "multi-MSI".

My claim is that x86 does support legacy-MSI but, as a design decision,
has avoided multi-MSI.

AFAIU, there are few multi-MSI devices out there.

>> Will we really see HW requiring multi-MSI support on RISC-V systems
>> without IOMMU? To me this sounds like a theoretical exercise.
>>
>> > Refer, x86_vector_msi_parent_ops in arch/x86/kernel/apic/msi.c and
>> > X86_VECTOR_MSI_FLAGS_SUPPORTED in arch/x86/include/asm/msi.h
>> >
>> > Refer, its_pci_msi_domain_info in drivers/irqchip/irq-gic-v3-its-pci-msi.c
>> >
>> > The changes which I think are need in the IRQ matrix allocator before
>> > integrating it in the IMSIC driver are the following:
>> > 1) IRQ matrix allocator assumed NR_VECTORS to be a fixed define
>> > which the arch code provides but in RISC-V world the number of
>> > IDs are discovered from DT or ACPI.
>>
>> Ok, let's try to be bit more explicit. Have you had a look at
>> kernel/irq/matrix.c?
>
> Why do you doubt it ?

Again, no doubts -- I'm just trying to clarify. Sorry if that touched a
nerve!

>> You need to define the IRQ_MATRIX_BITS (which x86 sets to NR_VECTORS).
>> This is the size of the bitmap. For IMSIC this would be 2047.
>
> Wow, let's just create large bitmaps even when underlying HW has
> fewer per-CPU IDs !!!

Yeah, fair argument. It's a bit too much. Here's a patch to the matrix
allocator that fixes that. Note that it's only compile tested:

--8<--
From 2be4093a39b0560247289f8f4c8214cdacda7870 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bj=C3=B6rn=20T=C3=B6pel?= <[email protected]>
Date: Thu, 26 Oct 2023 10:17:21 +0200
Subject: [PATCH] genirq/matrix: Dynamic bitmap allocation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Some (future) users of the irq matrix allocator do not know the size
of the bitmaps at compile time.

To avoid wasting memory on unnecessarily large bitmaps, size the bitmap
at matrix allocation time.

Signed-off-by: Björn Töpel <[email protected]>
---
arch/x86/include/asm/hw_irq.h | 2 --
kernel/irq/matrix.c | 33 ++++++++++++++++++++++-----------
2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 551829884734..dcfaa3812306 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -16,8 +16,6 @@

#include <asm/irq_vectors.h>

-#define IRQ_MATRIX_BITS NR_VECTORS
-
#ifndef __ASSEMBLY__

#include <linux/percpu.h>
diff --git a/kernel/irq/matrix.c b/kernel/irq/matrix.c
index 1698e77645ac..16ce956935ca 100644
--- a/kernel/irq/matrix.c
+++ b/kernel/irq/matrix.c
@@ -8,8 +8,6 @@
#include <linux/cpu.h>
#include <linux/irq.h>

-#define IRQ_MATRIX_SIZE (BITS_TO_LONGS(IRQ_MATRIX_BITS))
-
struct cpumap {
unsigned int available;
unsigned int allocated;
@@ -17,8 +15,9 @@ struct cpumap {
unsigned int managed_allocated;
bool initialized;
bool online;
- unsigned long alloc_map[IRQ_MATRIX_SIZE];
- unsigned long managed_map[IRQ_MATRIX_SIZE];
+ unsigned long *alloc_map;
+ unsigned long *managed_map;
+ unsigned long bitmap_storage[];
};

struct irq_matrix {
@@ -32,8 +31,10 @@ struct irq_matrix {
unsigned int total_allocated;
unsigned int online_maps;
struct cpumap __percpu *maps;
- unsigned long scratch_map[IRQ_MATRIX_SIZE];
- unsigned long system_map[IRQ_MATRIX_SIZE];
+ unsigned long *scratch_map;
+ unsigned long *system_map;
+ unsigned long bitmap_storage[];
+
};

#define CREATE_TRACE_POINTS
@@ -50,24 +51,34 @@ __init struct irq_matrix *irq_alloc_matrix(unsigned int matrix_bits,
unsigned int alloc_start,
unsigned int alloc_end)
{
+ unsigned int cpu, matrix_size = BITS_TO_LONGS(matrix_bits);
struct irq_matrix *m;

- if (matrix_bits > IRQ_MATRIX_BITS)
- return NULL;
-
- m = kzalloc(sizeof(*m), GFP_KERNEL);
+ m = kzalloc(struct_size(m, bitmap_storage, matrix_size * 2), GFP_KERNEL);
if (!m)
return NULL;

+ m->scratch_map = &m->bitmap_storage[0];
+ m->system_map = &m->bitmap_storage[matrix_size];
+
m->matrix_bits = matrix_bits;
m->alloc_start = alloc_start;
m->alloc_end = alloc_end;
m->alloc_size = alloc_end - alloc_start;
- m->maps = alloc_percpu(*m->maps);
+ m->maps = __alloc_percpu(struct_size(m->maps, bitmap_storage, matrix_size * 2),
+ __alignof__(*m->maps));
if (!m->maps) {
kfree(m);
return NULL;
}
+
+ for_each_possible_cpu(cpu){
+ struct cpumap *cm = per_cpu_ptr(m->maps, cpu);
+
+ cm->alloc_map = &cm->bitmap_storage[0];
+ cm->managed_map = &cm->bitmap_storage[matrix_size];
+ }
+
return m;
}


base-commit: 611da07b89fdd53f140d7b33013f255bf0ed8f34
--
2.40.1

--8<--


>> The matrix allocator is an excellent fit, modulo multi-MSI. It's battle
>> proven code.
>>
>> > 2) IRQ matrix allocator needs to be support allocator multiple vectors
>> > in power-of-2 which will allow IMSIC driver to support both legacy
>> > MSI and MSI-X. This will involve changing the way best CPU is
>> > found, the way bitmap APIs are used and adding some new APIs
>> > for allocate vectors in power-of-2
>>
>> ...and all the other things multi-MSI requires.
>>
>> > Based on above, I suggest we keep the integration of IRQ matrix
>> > allocator in the IMSIC driver as a separate series which will allow
>> > us to unblock other series (such as AIA ACPI support, power
>> > managment related changes in AIA drivers, etc).
>>
>> I suggest removing the multi-MSI support, and use the matrix allocator.
>> We have something that looks like what x86 has (IMSIC). We have a
>> battle-proven implementation, and helper function. In my view it would
>> be just weird not to piggy-back on that work, and benefit from years of
>> bugfixes/things we haven't thought of.
>>
>> Finally; I don't see that you're handling managed interrupt in the
>> series (Oh, the matrix allocator has support for that! ;-)).
>
> We don't need managed interrupts like x86 does. We are using
> IPI-mux to create multiple virtual IPIs on-top-of single ID and we
> use some of these virtual IPIs for internal managment.

I'm not following here, or what IPIs have to do with managed
interrupts. I'm referring to "IRQD_AFFINITY_MANAGED".

I'm probably missing something?


Björn

2023-10-28 18:19:15

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

On Thu, Oct 26 2023 at 10:51, Björn Töpel wrote:
>>> >> > + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
>>> >> > + bitmap_clear(lpriv->ids_enabled_bitmap, vec->local_id, 1);
>>> >> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
>>> >> > +
>>> >> > + imsic_remote_sync(vec->cpu);
>>> >>
>>> >> x86 seems to set a timer instead, for the remote cpu cleanup, which can
>>> >> be much cheaper, and less in instrusive. Is that applicable here?
>>> >
>>> > The issue with that approach is deciding the right duration
>>> > of timer interrupt. There might be platforms who need
>>> > immediate mask/unmask response. We can certainely
>>> > keep improving/tuning this over-time.
>>>
>>> Any concrete examples where this is an actual problem?
>>
>> Do you have a concrete timer duration with proper justification ?
>
> I would simply mimic what x86 does for now -- jiffies + 1.

That's good enough. The point is that the interrupt might still end up
on the old target CPU depending on timing, but the next one is
guaranteed to be targeted to the new target CPU.

So you can't clean up the vector on the old target immediately, but it
does not matter at all whether you clean it up 10ms or 10s later. It's
just wasting a vector on the old target.

Doing it with an IPI (as x86 did before) only works when the IPI vector
is of lower priority than the vector which got moved. Otherwise the IPI
will be served first, find the vector pending and then it's up a creek
without a paddle because it can't retrigger the IPI as that would again
be served first. So it can't clean up ever...

The timer just avoids this and as I said the delay is completely
irrelevant.

>>> >> The reason I'm asking is because I'm pretty certain that x86 has proper
>>> >> MSI support (Thomas Gleixner can answer for sure! ;-))

It has proper MSI support with some limitations.

>>> >> IMSIC smells a lot like the the LAPIC.

Eeew. :)

> My claim is that x86 does support legacy-MSI, but for design decision,
> has avoided multi-MSI.

There are two variants of PCI/MSI:

1) MSI
2) MSI-X

Neither of them is legacy and both support multiple vectors at the
device hardware level.

#1 MSI

Affinity setting requires to move all vectors to the new target in
one go because the device gets only the base vector in the MSI
message and uses the lower bits as index.

So that's of limited use anyway because it's impossible to use
that for multi-queue or other purposes where the main point is to
spread the interrupts across CPUs.

It does not have mandatory masking which makes affinity changes
even more problematic at least on x86 because the update to the
message store in the PCI config space is non-atomic. See the dance
which is required for a single vector in msi_set_affinity().

IOW, if the MSI message is directly delivered to the target CPU
and the device does not support masking, then even a single vector is
already complex and multi-MSI support becomes a horrorshow.

Another issue especially on x86 with the limitation of about 200
device vectors per CPU is the requirement to allocate a
consecutive, power-of-2 aligned vector space. That gets to the point
of vector exhaustion pretty fast.

Those _are_ the reasons why x86 does not support multi-MSI without
interrupt remapping. It just does the only sane thing and limits
to one vector per device.

Interrupt remapping avoids the problem because it allows steering
the vectors individually and the affinity update is atomic. It
obviously also lifts the requirement for a consecutive vector
space.

Seriously, w/o interrupt remapping or an equivalent translation
mechanism which allows steering the vectors individually, multi-MSI
is absolutely pointless and not worth the trouble to support.


#2 MSI-X

Has a message store per vector and mandatory per-vector masking,
which makes multi-vector support trivial even w/o interrupt
remapping. Nor does it require a consecutive vector space.

So if AIA is similar to the APIC, then single MSI needs the same dance
and multi-MSI needs that theatre ^ N.
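
To make the #1 constraint above concrete: with multi-MSI the OS programs a single message per function and the device derives the per-interrupt data word itself, roughly as in the toy illustration below (not code from this series). That is why the allocated block has to be consecutive and power-of-2 aligned, and why all of its vectors move together on affinity changes:

#include <linux/types.h>

/*
 * Toy model of a multi-MSI capable device: 'base_data' is the single MSI
 * data word programmed by the OS, 'index' is the interrupt source within
 * the function (0..nvec-1), 'nvec' the enabled vector count (power of 2).
 */
static inline u32 multi_msi_data(u32 base_data, unsigned int index,
				 unsigned int nvec)
{
	/*
	 * The device just ORs the index into the low log2(nvec) bits, so the
	 * OS must hand out a consecutive vector block with
	 * (base_data & (nvec - 1)) == 0.
	 */
	return base_data | (index & (nvec - 1));
}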

> AFAIU, there are few multi-MSI devices out there.

You wish. MSI-X is "more expensive" (probably 0.5 cents). Now that
interrupt remapping is pretty much always available on x86, the problem
is "fixed" indirectly. So especially x86 on-chip devices still use MSI
and not MSI-X. MSI-X is primarily used in multi-queue devices as
multi-MSI is limited to 32 vectors.

Thanks,

tglx

2023-10-28 18:34:49

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

On Mon, Oct 23 2023 at 22:57, Anup Patel wrote:
> +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
> +void imsic_vector_debug_show(struct seq_file *m,
> + struct imsic_vector *vec, int ind)
> +{
> + unsigned int mcpu = 0, mlocal_id = 0;
> + struct imsic_local_priv *lpriv;
> + bool move_in_progress = false;
> + struct imsic_vector *mvec;
> + bool is_enabled = false;
> + unsigned long flags;
> +
> + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
> + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
> + return;
> +
> + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> + if (test_bit(vec->local_id, lpriv->ids_enabled_bitmap))
> + is_enabled = true;
> + mvec = lpriv->ids_move[vec->local_id];
> + if (mvec) {
> + move_in_progress = true;
> + mcpu = mvec->cpu;
> + mlocal_id = mvec->local_id;
> + }
> + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> +
> + seq_printf(m, "%*starget_cpu : %5u\n", ind, "", vec->cpu);
> + seq_printf(m, "%*starget_local_id : %5u\n", ind, "", vec->local_id);
> + seq_printf(m, "%*sis_reserved : %5u\n", ind, "",
> + (vec->local_id <= IMSIC_IPI_ID) ? 1 : 0);
> + seq_printf(m, "%*sis_enabled : %5u\n", ind, "",
> + (move_in_progress) ? 1 : 0);
> + seq_printf(m, "%*sis_move_pending : %5u\n", ind, "",
> + (move_in_progress) ? 1 : 0);
> + if (move_in_progress) {
> + seq_printf(m, "%*smove_cpu : %5u\n", ind, "", mcpu);
> + seq_printf(m, "%*smove_local_id : %5u\n", ind, "", mlocal_id);
> + }
> +}
> +
> +void imsic_vector_debug_show_summary(struct seq_file *m, int ind)
> +{
> + unsigned int cpu, total_avail = 0, total_used = 0;
> + struct imsic_global_config *global = &imsic->global;
> + struct imsic_local_priv *lpriv;
> + unsigned long flags;
> +
> + for_each_possible_cpu(cpu) {
> + lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> +
> + total_avail += global->nr_ids;
> +
> + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> + total_used += bitmap_weight(lpriv->ids_used_bitmap,
> + global->nr_ids + 1) - 1;
> + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> + }
> +
> + seq_printf(m, "%*stotal : %5u\n", ind, "", total_avail);
> + seq_printf(m, "%*sused : %5u\n", ind, "", total_used);
> + seq_printf(m, "%*s| CPU | tot | usd | vectors\n", ind, " ");
> +
> + cpus_read_lock();
> + for_each_online_cpu(cpu) {
> + lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> +
> + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> + total_used = bitmap_weight(lpriv->ids_used_bitmap,
> + global->nr_ids + 1) - 1;
> + seq_printf(m, "%*s %4d %4u %4u %*pbl\n", ind, " ",
> + cpu, global->nr_ids, total_used,
> + global->nr_ids + 1, lpriv->ids_used_bitmap);
> + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> + }
> + cpus_read_unlock();

This looks very close to the matrix allocator information, just done differently.

> +static unsigned int imsic_vector_best_cpu(const struct cpumask *mask,
> + unsigned int order)
> +{
> + struct imsic_global_config *global = &imsic->global;
> + unsigned int cpu, best_cpu, free, maxfree = 0;
> + struct imsic_local_priv *lpriv;
> + unsigned long flags;
> +
> + best_cpu = UINT_MAX;
> + for_each_cpu(cpu, mask) {
> + if (!cpu_online(cpu))
> + continue;
> +
> + lpriv = per_cpu_ptr(imsic->lpriv, cpu);
> + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
> + free = bitmap_weight(lpriv->ids_used_bitmap,
> + global->nr_ids + 1);
> + free = (global->nr_ids + 1) - free;
> + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
> + if (free < BIT(order) || free <= maxfree)
> + continue;
> +
> + best_cpu = cpu;
> + maxfree = free;
> + }
> +
> + return best_cpu;

Looks very much like what the matrix allocator provides, right?

What's the actual reason that you can't use it?
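
(For reference, a rough sketch of how the generic matrix allocator from kernel/irq/matrix.c is typically consumed; the wrapper names below are made up for illustration and this is not code from this series:)

#include <linux/cpumask.h>
#include <linux/errno.h>
#include <linux/irq.h>

static struct irq_matrix *imsic_matrix;	/* hypothetical instance */

static int imsic_matrix_init(unsigned int nr_ids)
{
	/* One bit per per-CPU interrupt identity; identity 0 stays unused. */
	imsic_matrix = irq_alloc_matrix(nr_ids + 1, 1, nr_ids + 1);
	return imsic_matrix ? 0 : -ENOMEM;
}

static int imsic_matrix_alloc_vector(const struct cpumask *mask, unsigned int *cpu)
{
	/*
	 * Picks the least loaded online CPU in 'mask' plus a free bit on it,
	 * which is essentially what imsic_vector_best_cpu() open-codes.
	 * Returns the allocated bit or a negative error code.
	 */
	return irq_matrix_alloc(imsic_matrix, mask, false, cpu);
}

static void imsic_matrix_free_vector(unsigned int cpu, unsigned int id)
{
	irq_matrix_free(imsic_matrix, cpu, id, false);
}

The per-CPU state would additionally have to be brought up and torn down via irq_matrix_online()/irq_matrix_offline() from the CPU hotplug path.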

Thanks,

tglx

2023-10-28 18:36:46

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v11 09/14] irqchip/riscv-imsic: Add support for PCI MSI irqdomain

On Wed, Oct 25 2023 at 10:55, Björn Töpel wrote:
>> Now for the IMSIC-PCI domain, the PCI framework expects the
>> pci_msi_mask/unmask_irq() functions to be called, but if
>> we point directly to pci_msi_mask/unmask_irq() in the IMSIC-PCI
>> irqchip then the IMSIC-BASE (parent domain) irq_mask/unmask
>> won't be called, hence the IRQ won't be masked/unmasked.
>> Due to this, we call both pci_msi_mask/unmask_irq() and
>> irq_chip_mask/unmask_parent() for the IMSIC-PCI domain.
>
> Ok. I won't dig more into it for now! If the interrupt is disabled at
> PCI, it seems a bit overkill to *also* mask it at the IMSIC level...

Only _if_ the device provides MSI masking, but that extra mask/unmask is
not the end of the world.
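
(For illustration only -- a made-up irqchip showing that dual mask/unmask pattern for a child of an IMSIC-style parent domain; it is not code from this series:)

#include <linux/irq.h>
#include <linux/msi.h>

static void demo_pci_msi_mask(struct irq_data *d)
{
	pci_msi_mask_irq(d);		/* masks at the device, if it supports masking */
	irq_chip_mask_parent(d);	/* always masks the vector at the parent (IMSIC) */
}

static void demo_pci_msi_unmask(struct irq_data *d)
{
	irq_chip_unmask_parent(d);
	pci_msi_unmask_irq(d);
}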

Thanks,

tglx

2023-10-29 19:53:45

by Björn Töpel

[permalink] [raw]
Subject: Re: [PATCH v11 09/14] irqchip/riscv-imsic: Add support for PCI MSI irqdomain

Thomas Gleixner <[email protected]> writes:

> On Wed, Oct 25 2023 at 10:55, Björn Töpel wrote:
>>> Now for the IMSIC-PCI domain, the PCI framework expects the
>>> pci_msi_mask/unmask_irq() functions to be called, but if
>>> we point directly to pci_msi_mask/unmask_irq() in the IMSIC-PCI
>>> irqchip then the IMSIC-BASE (parent domain) irq_mask/unmask
>>> won't be called, hence the IRQ won't be masked/unmasked.
>>> Due to this, we call both pci_msi_mask/unmask_irq() and
>>> irq_chip_mask/unmask_parent() for the IMSIC-PCI domain.
>>
>> Ok. I won't dig more into it for now! If the interrupt is disabled at
>> PCI, it seems a bit overkill to *also* mask it at the IMSIC level...
>
> Only _if_ the device provides MSI masking, but that extra mask/unmask is
> not the end of the world.

Yikes -- so MSI masking is optional. Ick. :-( Thanks for the excellent
MSI vs MSI-X post in the other thread, BTW. Great stuff!

2023-11-02 06:40:06

by Ben

[permalink] [raw]
Subject: Re:[PATCH v11 12/14] irqchip/riscv-aplic: Add support for MSI-mode


At 2023-10-24 01:27:58, "Anup Patel" <[email protected]> wrote:
>The RISC-V advanced platform-level interrupt controller (APLIC) has
>two modes of operation: 1) Direct mode and 2) MSI mode.
>(For more details, refer https://github.com/riscv/riscv-aia)
>
>In APLIC MSI-mode, wired interrupts are forwarded as message signaled
>interrupts (MSIs) to CPUs via IMSIC.
>
>We extend the existing APLIC irqchip driver to support MSI-mode for
>RISC-V platforms having both wired interrupts and MSIs.
>
>Signed-off-by: Anup Patel <[email protected]>
>---
> drivers/irqchip/Kconfig | 6 +
> drivers/irqchip/Makefile | 1 +
> drivers/irqchip/irq-riscv-aplic-main.c | 2 +-
> drivers/irqchip/irq-riscv-aplic-main.h | 8 +
> drivers/irqchip/irq-riscv-aplic-msi.c | 285 +++++++++++++++++++++++++
> 5 files changed, 301 insertions(+), 1 deletion(-)
> create mode 100644 drivers/irqchip/irq-riscv-aplic-msi.c
>
>diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
>index 1996cc6f666a..7adc4dbe07ff 100644
>--- a/drivers/irqchip/Kconfig
>+++ b/drivers/irqchip/Kconfig
>@@ -551,6 +551,12 @@ config RISCV_APLIC
> depends on RISCV
> select IRQ_DOMAIN_HIERARCHY
>
>+config RISCV_APLIC_MSI
>+ bool
>+ depends on RISCV_APLIC
>+ select GENERIC_MSI_IRQ
>+ default RISCV_APLIC
>+
> config RISCV_IMSIC
> bool
> depends on RISCV
>diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
>index 7f8289790ed8..47995fdb2c60 100644
>--- a/drivers/irqchip/Makefile
>+++ b/drivers/irqchip/Makefile
>@@ -96,6 +96,7 @@ obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic-main.o irq-riscv-aplic-direct.o
>+obj-$(CONFIG_RISCV_APLIC_MSI) += irq-riscv-aplic-msi.o
> obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o irq-riscv-imsic-platform.o
> obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
>diff --git a/drivers/irqchip/irq-riscv-aplic-main.c b/drivers/irqchip/irq-riscv-aplic-main.c
>index 87450708a733..d1b342b66551 100644
>--- a/drivers/irqchip/irq-riscv-aplic-main.c
>+++ b/drivers/irqchip/irq-riscv-aplic-main.c
>@@ -205,7 +205,7 @@ static int aplic_probe(struct platform_device *pdev)
> msi_mode = of_property_present(to_of_node(dev->fwnode),
> "msi-parent");
> if (msi_mode)
>- rc = -ENODEV;
>+ rc = aplic_msi_setup(dev, regs);
> else
> rc = aplic_direct_setup(dev, regs);
> if (rc) {
>diff --git a/drivers/irqchip/irq-riscv-aplic-main.h b/drivers/irqchip/irq-riscv-aplic-main.h
>index 474a04229334..78267ec58098 100644
>--- a/drivers/irqchip/irq-riscv-aplic-main.h
>+++ b/drivers/irqchip/irq-riscv-aplic-main.h
>@@ -41,5 +41,13 @@ void aplic_init_hw_global(struct aplic_priv *priv, bool msi_mode);
> int aplic_setup_priv(struct aplic_priv *priv, struct device *dev,
> void __iomem *regs);
> int aplic_direct_setup(struct device *dev, void __iomem *regs);
>+#ifdef CONFIG_RISCV_APLIC_MSI
>+int aplic_msi_setup(struct device *dev, void __iomem *regs);
>+#else
>+static inline int aplic_msi_setup(struct device *dev, void __iomem *regs)
>+{
>+ return -ENODEV;
>+}
>+#endif
>
> #endif
>diff --git a/drivers/irqchip/irq-riscv-aplic-msi.c b/drivers/irqchip/irq-riscv-aplic-msi.c
>new file mode 100644
>index 000000000000..086d00e0429e
>--- /dev/null
>+++ b/drivers/irqchip/irq-riscv-aplic-msi.c
>@@ -0,0 +1,285 @@
>+// SPDX-License-Identifier: GPL-2.0
>+/*
>+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
>+ * Copyright (C) 2022 Ventana Micro Systems Inc.
>+ */
>+
>+#include <linux/bitops.h>
>+#include <linux/cpu.h>
>+#include <linux/interrupt.h>
>+#include <linux/irqchip.h>
>+#include <linux/irqchip/riscv-aplic.h>
>+#include <linux/irqchip/riscv-imsic.h>
>+#include <linux/module.h>
>+#include <linux/msi.h>
>+#include <linux/of_irq.h>
>+#include <linux/platform_device.h>
>+#include <linux/printk.h>
>+#include <linux/smp.h>
>+
>+#include "irq-riscv-aplic-main.h"
>+
>+static void aplic_msi_irq_unmask(struct irq_data *d)
>+{
>+ aplic_irq_unmask(d);
>+ irq_chip_unmask_parent(d);
>+}
>+
>+static void aplic_msi_irq_mask(struct irq_data *d)
>+{
>+ aplic_irq_mask(d);
>+ irq_chip_mask_parent(d);
>+}
>+
>+static void aplic_msi_irq_eoi(struct irq_data *d)
>+{
>+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
>+ u32 reg_off, reg_mask;
>+
>+ /*
>+ * EOI handling is required only for level-triggered
>+ * interrupts in APLIC MSI mode.
>+ */
>+
>+ reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
>+ reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
>+ switch (irqd_get_trigger_type(d)) {
>+ case IRQ_TYPE_LEVEL_LOW:
>+ if (!(readl(priv->regs + reg_off) & reg_mask))
>+ writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
>+ break;
>+ case IRQ_TYPE_LEVEL_HIGH:
>+ if (readl(priv->regs + reg_off) & reg_mask)
>+ writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
>+ break;
>+ }
>+}
>+
>+static struct irq_chip aplic_msi_chip = {
>+ .name = "APLIC-MSI",
>+ .irq_mask = aplic_msi_irq_mask,
>+ .irq_unmask = aplic_msi_irq_unmask,
>+ .irq_set_type = aplic_irq_set_type,
>+ .irq_eoi = aplic_msi_irq_eoi,
>+#ifdef CONFIG_SMP
>+ .irq_set_affinity = irq_chip_set_affinity_parent,
>+#endif
>+ .flags = IRQCHIP_SET_TYPE_MASKED |
>+ IRQCHIP_SKIP_SET_WAKE |
>+ IRQCHIP_MASK_ON_SUSPEND,
>+};
>+
>+static int aplic_msi_irqdomain_translate(struct irq_domain *d,
>+ struct irq_fwspec *fwspec,
>+ unsigned long *hwirq,
>+ unsigned int *type)
>+{
>+ struct aplic_priv *priv = platform_msi_get_host_data(d);
>+
>+ return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
>+}
>+
>+static int aplic_msi_irqdomain_alloc(struct irq_domain *domain,
>+ unsigned int virq, unsigned int nr_irqs,
>+ void *arg)
>+{
>+ int i, ret;
>+ unsigned int type;
>+ irq_hw_number_t hwirq;
>+ struct irq_fwspec *fwspec = arg;
>+ struct aplic_priv *priv = platform_msi_get_host_data(domain);
>+
>+ ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
>+ if (ret)
>+ return ret;
>+
>+ ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
>+ if (ret)
>+ return ret;
>+
>+ for (i = 0; i < nr_irqs; i++) {
>+ irq_domain_set_info(domain, virq + i, hwirq + i,
>+ &aplic_msi_chip, priv, handle_fasteoi_irq,
>+ NULL, NULL);
>+ /*
>+ * APLIC does not implement irq_disable() so Linux interrupt
>+ * subsystem will take a lazy approach for disabling an APLIC
>+ * interrupt. This means APLIC interrupts are left unmasked
>+ * upon system suspend and interrupts are not processed
>+ * immediately upon system wake up. To tackle this, we disable
>+ * the lazy approach for all APLIC interrupts.
>+ */
>+ irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
>+ }

For a platform MSI irq, irq_domain_set_info() and irq_set_status_flags() will be called twice; the first call happens here:
platform_msi_device_domain_alloc->msi_domain_populate_irqs->irq_domain_alloc_irqs_hierarchy->imsic_irq_domain_alloc->irq_domain_set_info

So I think this for(...) loop here is not necessary and can be removed.




2023-11-02 12:38:48

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v11 12/14] irqchip/riscv-aplic: Add support for MSI-mode

On Thu, Nov 2, 2023 at 11:55 AM Ben <[email protected]> wrote:
>
>
> At 2023-10-24 01:27:58, "Anup Patel" <[email protected]> wrote:
> > [...]
> >+ ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> >+ if (ret)
> >+ return ret;
> >+
> >+ for (i = 0; i < nr_irqs; i++) {
> >+ irq_domain_set_info(domain, virq + i, hwirq + i,
> >+ &aplic_msi_chip, priv, handle_fasteoi_irq,
> >+ NULL, NULL);
> >+ /*
> >+ * APLIC does not implement irq_disable() so Linux interrupt
> >+ * subsystem will take a lazy approach for disabling an APLIC
> >+ * interrupt. This means APLIC interrupts are left unmasked
> >+ * upon system suspend and interrupts are not processed
> >+ * immediately upon system wake up. To tackle this, we disable
> >+ * the lazy approach for all APLIC interrupts.
> >+ */
> >+ irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> >+ }
>
> For a platform MSI irq, irq_domain_set_info() and irq_set_status_flags() will be called twice; the first call happens here:
> platform_msi_device_domain_alloc->msi_domain_populate_irqs->irq_domain_alloc_irqs_hierarchy->imsic_irq_domain_alloc->irq_domain_set_info
>
> So I think this for(...) loop here is not necessary and can be removed.

If we remove then it breaks APLIC MSI-mode because we have
hierarchical irq domains where the APLIC-MSI domain is a child
of the IMSIC-PLAT domain.

The irq_domain_set_info() called by IMSIC driver only sets irqchip
for IMSIC irq whereas irq_domain_set_info() called by APLIC driver
sets irqchip for APLIC irq. We use a different APLIC irqchip for the
APLIC domain to mask, unmask, and eoi irqs in an APLIC specific
way.
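
(As a generic, hedged sketch of that point -- not the actual APLIC/IMSIC code, all names below are made up -- a child domain's .alloc typically lets the parent install its irqchip on the virqs and then installs its own:)

#include <linux/irq.h>
#include <linux/irqdomain.h>

static struct irq_chip demo_child_chip = {
	.name		= "CHILD",		/* stands in for APLIC-MSI */
	.irq_mask	= irq_chip_mask_parent,
	.irq_unmask	= irq_chip_unmask_parent,
};

static int demo_child_domain_alloc(struct irq_domain *domain, unsigned int virq,
				   unsigned int nr_irqs, void *arg)
{
	irq_hw_number_t hwirq = 0;	/* would come from .translate() */
	int ret, i;

	/* The parent level (IMSIC here) allocates the same virqs and
	 * installs *its* irqchip on its own level of irq_data ... */
	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
	if (ret)
		return ret;

	/* ... and the child level installs a different irqchip on the same
	 * virqs, so mask/unmask/eoi can do child-specific work and then
	 * forward to the parent via the irq_chip_*_parent() helpers. */
	for (i = 0; i < nr_irqs; i++)
		irq_domain_set_info(domain, virq + i, hwirq + i,
				    &demo_child_chip, NULL,
				    handle_fasteoi_irq, NULL, NULL);
	return 0;
}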

Regards,
Anup


>
>
> >+
> >+ return 0;
> >+}
> >+
> >+static const struct irq_domain_ops aplic_msi_irqdomain_ops = {
> >+ .translate = aplic_msi_irqdomain_translate,
> >+ .alloc = aplic_msi_irqdomain_alloc,
> >+ .free = platform_msi_device_domain_free,
> >+};
> >+
> >+static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> >+{
> >+ unsigned int group_index, hart_index, guest_index, val;
> >+ struct irq_data *d = irq_get_irq_data(desc->irq);
> >+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> >+ struct aplic_msicfg *mc = &priv->msicfg;
> >+ phys_addr_t tppn, tbppn, msg_addr;
> >+ void __iomem *target;
> >+
> >+ /* For zeroed MSI, simply write zero into the target register */
> >+ if (!msg->address_hi && !msg->address_lo && !msg->data) {
> >+ target = priv->regs + APLIC_TARGET_BASE;
> >+ target += (d->hwirq - 1) * sizeof(u32);
> >+ writel(0, target);
> >+ return;
> >+ }
> >+
> >+ /* Sanity check on message data */
> >+ WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> >+
> >+ /* Compute target MSI address */
> >+ msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> >+ tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> >+
> >+ /* Compute target HART Base PPN */
> >+ tbppn = tppn;
> >+ tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> >+ tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> >+ tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> >+ WARN_ON(tbppn != mc->base_ppn);
> >+
> >+ /* Compute target group and hart indexes */
> >+ group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> >+ APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> >+ hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> >+ APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> >+ hart_index |= (group_index << mc->lhxw);
> >+ WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> >+
> >+ /* Compute target guest index */
> >+ guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> >+ WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> >+
> >+ /* Update IRQ TARGET register */
> >+ target = priv->regs + APLIC_TARGET_BASE;
> >+ target += (d->hwirq - 1) * sizeof(u32);
> >+ val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> >+ << APLIC_TARGET_HART_IDX_SHIFT;
> >+ val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> >+ << APLIC_TARGET_GUEST_IDX_SHIFT;
> >+ val |= (msg->data & APLIC_TARGET_EIID_MASK);
> >+ writel(val, target);
> >+}
> >+
> >+int aplic_msi_setup(struct device *dev, void __iomem *regs)
> >+{
> >+ const struct imsic_global_config *imsic_global;
> >+ struct irq_domain *irqdomain;
> >+ struct aplic_priv *priv;
> >+ struct aplic_msicfg *mc;
> >+ phys_addr_t pa;
> >+ int rc;
> >+
> >+ priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> >+ if (!priv)
> >+ return -ENOMEM;
> >+
> >+ rc = aplic_setup_priv(priv, dev, regs);
> >+ if (rc) {
> >+ dev_err(dev, "failed to create APLIC context\n");
> >+ return rc;
> >+ }
> >+ mc = &priv->msicfg;
> >+
> >+ /*
> >+ * The APLIC outgoing MSI config registers assume target MSI
> >+ * controller to be RISC-V AIA IMSIC controller.
> >+ */
> >+ imsic_global = imsic_get_global_config();
> >+ if (!imsic_global) {
> >+ dev_err(dev, "IMSIC global config not found\n");
> >+ return -ENODEV;
> >+ }
> >+
> >+ /* Find number of guest index bits (LHXS) */
> >+ mc->lhxs = imsic_global->guest_index_bits;
> >+ if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> >+ dev_err(dev, "IMSIC guest index bits big for APLIC LHXS\n");
> >+ return -EINVAL;
> >+ }
> >+
> >+ /* Find number of HART index bits (LHXW) */
> >+ mc->lhxw = imsic_global->hart_index_bits;
> >+ if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> >+ dev_err(dev, "IMSIC hart index bits big for APLIC LHXW\n");
> >+ return -EINVAL;
> >+ }
> >+
> >+ /* Find number of group index bits (HHXW) */
> >+ mc->hhxw = imsic_global->group_index_bits;
> >+ if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> >+ dev_err(dev, "IMSIC group index bits big for APLIC HHXW\n");
> >+ return -EINVAL;
> >+ }
> >+
> >+ /* Find first bit position of group index (HHXS) */
> >+ mc->hhxs = imsic_global->group_index_shift;
> >+ if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> >+ dev_err(dev, "IMSIC group index shift should be >= %d\n",
> >+ (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> >+ return -EINVAL;
> >+ }
> >+ mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> >+ if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> >+ dev_err(dev, "IMSIC group index shift big for APLIC HHXS\n");
> >+ return -EINVAL;
> >+ }
> >+
> >+ /* Compute PPN base */
> >+ mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> >+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> >+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> >+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> >+
> >+ /* Setup global config and interrupt delivery */
> >+ aplic_init_hw_global(priv, true);
> >+
> >+ /* Set the APLIC device MSI domain if not available */
> >+ if (!dev_get_msi_domain(dev)) {
> >+ /*
> >+ * The device MSI domain for OF devices is only set at the
> >+ * time of populating/creating OF device. If the device MSI
> >+ * domain is discovered later after the OF device is created
> >+ * then we need to set it explicitly before using any platform
> >+ * MSI functions.
> >+ *
> >+ * In case of APLIC device, the parent MSI domain is always
> >+ * IMSIC and the IMSIC MSI domains are created later through
> >+ * the platform driver probing so we set it explicitly here.
> >+ */
> >+ if (is_of_node(dev->fwnode))
> >+ of_msi_configure(dev, to_of_node(dev->fwnode));
> >+ }
> >+
> >+ /* Create irq domain instance for the APLIC MSI-mode */
> >+ irqdomain = platform_msi_create_device_domain(
> >+ dev, priv->nr_irqs + 1,
> >+ aplic_msi_write_msg,
> >+ &aplic_msi_irqdomain_ops,
> >+ priv);
> >+ if (!irqdomain) {
> >+ dev_err(dev, "failed to create MSI irq domain\n");
> >+ return -ENOMEM;
> >+ }
> >+
> >+ /* Advertise the interrupt controller */
> >+ pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> >+ dev_info(dev, "%d interrupts forwarded to MSI base %pa\n",
> >+ priv->nr_irqs, &pa);
> >+
> >+ return 0;
> >+}
> >--
> >2.34.1
> >
> >
> >_______________________________________________
> >linux-riscv mailing list
> >[email protected]
> >http://lists.infradead.org/mailman/listinfo/linux-riscv

2023-11-03 09:44:47

by Ben

[permalink] [raw]
Subject: Re:Re: [PATCH v11 12/14] irqchip/riscv-aplic: Add support for MSI-mode



At 2023-11-02 20:37:42, "Anup Patel" <[email protected]> wrote:
>On Thu, Nov 2, 2023 at 11:55 AM Ben <[email protected]> wrote:
>>
>>
>> At 2023-10-24 01:27:58, "Anup Patel" <[email protected]> wrote:
>> > [...]
>> >+ ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
>> >+ if (ret)
>> >+ return ret;
>> >+
>> >+ for (i = 0; i < nr_irqs; i++) {
>> >+ irq_domain_set_info(domain, virq + i, hwirq + i,
>> >+ &aplic_msi_chip, priv, handle_fasteoi_irq,
>> >+ NULL, NULL);
>> >+ /*
>> >+ * APLIC does not implement irq_disable() so Linux interrupt
>> >+ * subsystem will take a lazy approach for disabling an APLIC
>> >+ * interrupt. This means APLIC interrupts are left unmasked
>> >+ * upon system suspend and interrupts are not processed
>> >+ * immediately upon system wake up. To tackle this, we disable
>> >+ * the lazy approach for all APLIC interrupts.
>> >+ */
>> >+ irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
>> >+ }
>>
>> For a platform MSI irq, irq_domain_set_info() and irq_set_status_flags() will be called twice; the first call happens here:
>> platform_msi_device_domain_alloc->msi_domain_populate_irqs->irq_domain_alloc_irqs_hierarchy->imsic_irq_domain_alloc->irq_domain_set_info
>>
>> So I think this for(...) loop here is not necessary and can be removed.
>
>If we remove then it breaks APLIC MSI-mode because we have
>hierarchical irq domains where the APLIC-MSI domain is a child
>of the IMSIC-PLAT domain.
>
>The irq_domain_set_info() called by IMSIC driver only sets irqchip
>for IMSIC irq whereas irq_domain_set_info() called by APLIC driver
>sets irqchip for APLIC irq. We use a different APLIC irqchip for the
>APLIC domain to mask, unmask, and eoi irqs in an APLIC specific
>way.
>

As you said, the APLIC-MSI domain is a child of the IMSIC-PLAT domain, so all platform IRQs and wired IRQs go to the APLIC-MSI domain first.
What about pure MSI interrupts? For example, the MSIs of a PCIe device, or a device driver calling platform_msi_domain_alloc_irqs() to allocate an MSI?
In this scenario, do they also go into the APLIC-MSI domain first?

Would you please provide the steps for testing PCI MSI with your patchset on QEMU? I run a QEMU system, but I cannot find any PCI devices using MSI; in particular the virtio devices are using platform IRQs.

~# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
10: 38972 38946 38882 38924 RISC-V INTC 5 Edge riscv-timer
11: 0 1149 0 0 APLIC-MSI 8 Level virtio0
12: 0 0 21 0 APLIC-MSI 7 Level virtio1
13: 149 0 0 218 APLIC-MSI 10 Level ttyS0
IPI0: 40 53 43 50 Rescheduling interrupts
IPI1: 7518 8899 6679 7959 Function call interrupts
IPI2: 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts
IPI4: 0 0 0 0 IRQ work interrupts
IPI5: 0 0 0 0 Timer broadcast interrupts


2023-11-03 11:04:55

by Anup Patel

[permalink] [raw]
Subject: Re: Re: [PATCH v11 12/14] irqchip/riscv-aplic: Add support for MSI-mode

On Fri, Nov 3, 2023 at 3:14 PM Ben <[email protected]> wrote:
>
>
>
> At 2023-11-02 20:37:42, "Anup Patel" <[email protected]> wrote:
> >On Thu, Nov 2, 2023 at 11:55 AM Ben <[email protected]> wrote:
> >>
> >>
> >> At 2023-10-24 01:27:58, "Anup Patel" <[email protected]> wrote:
> >> > [...]
> >> >+ ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> >> >+ if (ret)
> >> >+ return ret;
> >> >+
> >> >+ for (i = 0; i < nr_irqs; i++) {
> >> >+ irq_domain_set_info(domain, virq + i, hwirq + i,
> >> >+ &aplic_msi_chip, priv, handle_fasteoi_irq,
> >> >+ NULL, NULL);
> >> >+ /*
> >> >+ * APLIC does not implement irq_disable() so Linux interrupt
> >> >+ * subsystem will take a lazy approach for disabling an APLIC
> >> >+ * interrupt. This means APLIC interrupts are left unmasked
> >> >+ * upon system suspend and interrupts are not processed
> >> >+ * immediately upon system wake up. To tackle this, we disable
> >> >+ * the lazy approach for all APLIC interrupts.
> >> >+ */
> >> >+ irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> >> >+ }
> >>
> >> For a platform MSI irq, irq_domain_set_info() and irq_set_status_flags() will be called twice; the first call happens here:
> >> platform_msi_device_domain_alloc->msi_domain_populate_irqs->irq_domain_alloc_irqs_hierarchy->imsic_irq_domain_alloc->irq_domain_set_info
> >>
> >> So I think this for(...) loop here is not necessary and can be removed.
> >
> >If we remove then it breaks APLIC MSI-mode because we have
> >hierarchical irq domains where the APLIC-MSI domain is a child
> >of the IMSIC-PLAT domain.
> >
> >The irq_domain_set_info() called by IMSIC driver only sets irqchip
> >for IMSIC irq whereas irq_domain_set_info() called by APLIC driver
> >sets irqchip for APLIC irq. We use a different APLIC irqchip for the
> >APLIC domain to mask, unmask, and eoi irqs in an APLIC specific
> >way.
> >
>
> As you said, the APLIC-MSI domain is a child of the IMSIC-PLAT domain, so all platform IRQs and wired IRQs go to the APLIC-MSI domain first.
> What about pure MSI interrupts? For example, the MSIs of a PCIe device, or a device driver calling platform_msi_domain_alloc_irqs() to allocate an MSI?

MSIs from a PCIe device will go directly to the IMSIC-PCI domain.

> In this scenario, do they also go into the APLIC-MSI domain first?

No

>
> Would you please provide the steps for testing PCI MSI with your patchset on QEMU? I run a QEMU system, but I cannot find any PCI devices using MSI; in particular the virtio devices are using platform IRQs.

Just add virtio-blk-pci or some other PCI device to your QEMU
command line, but ensure that the corresponding device driver
is enabled in your kernel.
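
For example -- the disk image path and ids below are just illustrative, adjust them to your setup -- appending something like "-drive file=rootfs.img,if=none,id=hd0 -device virtio-blk-pci,drive=hd0" to the QEMU command line should give you an MSI-X capable virtio-blk PCI device, provided CONFIG_VIRTIO_BLK and CONFIG_PCI_MSI are enabled in your kernel config.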

>
> ~# cat /proc/interrupts
> CPU0 CPU1 CPU2 CPU3
> 10: 38972 38946 38882 38924 RISC-V INTC 5 Edge riscv-timer
> 11: 0 1149 0 0 APLIC-MSI 8 Level virtio0
> 12: 0 0 21 0 APLIC-MSI 7 Level virtio1
> 13: 149 0 0 218 APLIC-MSI 10 Level ttyS0
> IPI0: 40 53 43 50 Rescheduling interrupts
> IPI1: 7518 8899 6679 7959 Function call interrupts
> IPI2: 0 0 0 0 CPU stop interrupts
> IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts
> IPI4: 0 0 0 0 IRQ work interrupts
> IPI5: 0 0 0 0 Timer broadcast interrupts
>
>

Regards,
Anup

2023-11-04 01:01:24

by Ben

[permalink] [raw]
Subject: Re:[PATCH v11 12/14] irqchip/riscv-aplic: Add support for MSI-mode

At 2023-10-24 01:27:58, "Anup Patel" <[email protected]> wrote:
> [...]
>+static void aplic_msi_irq_unmask(struct irq_data *d)
>+{
>+ aplic_irq_unmask(d);
>+ irq_chip_unmask_parent(d);
>+}
>+
>+static void aplic_msi_irq_mask(struct irq_data *d)
>+{
>+ aplic_irq_mask(d);
>+ irq_chip_mask_parent(d);
>+}
>+
>+static void aplic_msi_irq_eoi(struct irq_data *d)
>+{
>+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
>+ u32 reg_off, reg_mask;
>+
>+ /*
>+ * EOI handling is required only for level-triggered
>+ * interrupts in APLIC MSI mode.
>+ */
>+
>+ reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
>+ reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
>+ switch (irqd_get_trigger_type(d)) {
>+ case IRQ_TYPE_LEVEL_LOW:
>+ if (!(readl(priv->regs + reg_off) & reg_mask))
>+ writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
>+ break;
>+ case IRQ_TYPE_LEVEL_HIGH:
>+ if (readl(priv->regs + reg_off) & reg_mask)
>+ writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
>+ break;
>+ }
>+}
>+
>+static struct irq_chip aplic_msi_chip = {
>+ .name = "APLIC-MSI",
>+ .irq_mask = aplic_msi_irq_mask,
>+ .irq_unmask = aplic_msi_irq_unmask,
>+ .irq_set_type = aplic_irq_set_type,
>+ .irq_eoi = aplic_msi_irq_eoi,
>+#ifdef CONFIG_SMP
>+ .irq_set_affinity = irq_chip_set_affinity_parent,
>+#endif
>+ .flags = IRQCHIP_SET_TYPE_MASKED |
>+ IRQCHIP_SKIP_SET_WAKE |
>+ IRQCHIP_MASK_ON_SUSPEND,
>+};
>+
>+static int aplic_msi_irqdomain_translate(struct irq_domain *d,
>+ struct irq_fwspec *fwspec,
>+ unsigned long *hwirq,
>+ unsigned int *type)
>+{
>+ struct aplic_priv *priv = platform_msi_get_host_data(d);
>+
>+ return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
>+}
>+
>+static int aplic_msi_irqdomain_alloc(struct irq_domain *domain,
>+ unsigned int virq, unsigned int nr_irqs,
>+ void *arg)
>+{
>+ int i, ret;
>+ unsigned int type;
>+ irq_hw_number_t hwirq;
>+ struct irq_fwspec *fwspec = arg;
>+ struct aplic_priv *priv = platform_msi_get_host_data(domain);
>+
>+ ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
>+ if (ret)
>+ return ret;

In your patchset, wired IRQs and platform-device IRQs go into the APLIC-MSI domain first.
Let me assume a device has an MSI IRQ rather than a wired IRQ, and that it is a platform device in the system.
In that case the aplic_irqdomain_translate() function will parse the APLIC physical IRQ number from fwspec->param[0],
but since this is an MSI IRQ rather than a wired IRQ it should not have an APLIC physical IRQ number; the hwirq number should be allocated from the MSI bitmap.
What value will be parsed from the DTS then? Zero or a negative number?

And if this is a nonexistent physical IRQ number for the APLIC, how does aplic_msi_irq_unmask()->aplic_irq_unmask() work?

writel(d->hwirq, priv->regs + APLIC_SETIENUM);