2023-06-13 15:48:54

by Anup Patel

Subject: [PATCH v4 00/10] Linux RISC-V AIA Support

The RISC-V AIA specification is now frozen as per the RISC-V International
process. The latest frozen specification can be found at:
https://github.com/riscv/riscv-aia/releases/download/1.0-RC1/riscv-interrupts-1.0-RC1.pdf

At a high-level, the AIA specification adds three things:
1) AIA CSRs
- Improved local interrupt support
2) Incoming Message Signaled Interrupt Controller (IMSIC)
- Per-HART MSI controller
- Support MSI virtualization
- Support IPI along with virtualization
3) Advanced Platform-Level Interrupt Controller (APLIC)
- Wired interrupt controller
- In MSI-mode, converts wired interrupt into MSIs (i.e. MSI generator)
- In Direct-mode, injects external interrupts directly into HARTs

For an overview of the AIA specification, refer to the recent AIA virtualization
talk at KVM Forum 2022:
https://static.sched.com/hosted_files/kvmforum2022/a1/AIA_Virtualization_in_KVM_RISCV_final.pdf
https://www.youtube.com/watch?v=r071dL8Z0yo

PATCH2 of this series conflicts with the "irqchip/riscv-intc: Add ACPI
support" patch of the "Add basic ACPI support for RISC-V" series, so this
series is based on that series.
(See https://lore.kernel.org/lkml/[email protected]/)

To test this series, use QEMU v7.2 (or higher) and OpenSBI v1.2 (or higher).

These patches can also be found in the riscv_aia_v4 branch at:
https://github.com/avpatel/linux.git

Changes since v3:
- Rebased on Linux-6.4-rc6
- Dropped PATCH2 of the v3 series; instead we now set FWNODE_FLAG_BEST_EFFORT
via IRQCHIP_DECLARE()
- Extend riscv_fw_parent_hartid() to support both DT and ACPI in PATCH1
- Extend iommu_dma_compose_msi_msg() instead of adding iommu_dma_select_msi()
in PATCH6
- Addressed Conor's comments in PATCH3
- Addressed Conor's and Rob's comments in PATCH7

Changes since v2:
- Rebased on Linux-6.4-rc1
- Addressed Rob's comments on DT bindings patches 4 and 8.
- Addressed Marc's comments on IMSIC driver PATCH5
- Replaced use of OF APIs in the APLIC and IMSIC drivers with FWNODE APIs.
This makes both drivers easily portable to ACPI and also removes
unnecessary indirection from the APLIC and IMSIC drivers.
- PATCH1 is a new patch for portability with ACPI support
- PATCH2 is a new patch to fix probing in APLIC drivers for APLIC-only systems.
- PATCH7 is a new patch which addresses the IOMMU DMA domain issues pointed
out by SiFive

Changes since v1:
- Rebased on Linux-6.2-rc2
- Addressed comments on IMSIC DT bindings for PATCH4
- Use raw_spin_lock_irqsave() on ids_lock for PATCH5
- Improved MMIO alignment checks in PATCH5 to allow MMIO regions
with holes.
- Addressed comments on APLIC DT bindings for PATCH6
- Fixed warning splat in aplic_msi_write_msg() caused by
zeroed MSI message in PATCH7
- Dropped the DT property riscv,slow-ipi; a module parameter will be
added instead in the future.

Anup Patel (10):
RISC-V: Add riscv_fw_parent_hartid() function
irqchip/riscv-intc: Add support for RISC-V AIA
dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller
irqchip: Add RISC-V incoming MSI controller driver
irqchip/riscv-imsic: Add support for PCI MSI irqdomain
irqchip/riscv-imsic: Improve IOMMU DMA support
dt-bindings: interrupt-controller: Add RISC-V advanced PLIC
irqchip: Add RISC-V advanced PLIC driver
RISC-V: Select APLIC and IMSIC drivers
MAINTAINERS: Add entry for RISC-V AIA drivers

.../interrupt-controller/riscv,aplic.yaml | 169 +++
.../interrupt-controller/riscv,imsics.yaml | 172 +++
MAINTAINERS | 12 +
arch/riscv/Kconfig | 2 +
arch/riscv/include/asm/processor.h | 3 +
arch/riscv/kernel/cpu.c | 16 +
drivers/iommu/dma-iommu.c | 24 +-
drivers/irqchip/Kconfig | 20 +-
drivers/irqchip/Makefile | 2 +
drivers/irqchip/irq-riscv-aplic.c | 765 ++++++++++++
drivers/irqchip/irq-riscv-imsic.c | 1076 +++++++++++++++++
drivers/irqchip/irq-riscv-intc.c | 36 +-
include/linux/irqchip/riscv-aplic.h | 119 ++
include/linux/irqchip/riscv-imsic.h | 86 ++
14 files changed, 2492 insertions(+), 10 deletions(-)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
create mode 100644 drivers/irqchip/irq-riscv-aplic.c
create mode 100644 drivers/irqchip/irq-riscv-imsic.c
create mode 100644 include/linux/irqchip/riscv-aplic.h
create mode 100644 include/linux/irqchip/riscv-imsic.h

--
2.34.1



2023-06-13 15:53:26

by Anup Patel

Subject: [PATCH v4 03/10] dt-bindings: interrupt-controller: Add RISC-V incoming MSI controller

Add a DT bindings document for the RISC-V incoming MSI controller
(IMSIC) defined by the RISC-V advanced interrupt architecture (AIA)
specification.

Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Acked-by: Krzysztof Kozlowski <[email protected]>
---
.../interrupt-controller/riscv,imsics.yaml | 172 ++++++++++++++++++
1 file changed, 172 insertions(+)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
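
Reader's note (not part of the commit): the binding below describes how the
MSI target address of an IMSIC interrupt file encodes the group index, HART
index, and guest index above a 4KB page offset. The following is a minimal
sketch of that composition, assuming the field placement shown in the
binding's diagram. The struct and function names are hypothetical and are not
taken from the IMSIC driver.

```c
#include <stdint.h>

/* Each IMSIC interrupt file occupies one 4KB MMIO page (per the binding). */
#define IMSIC_PAGE_SHIFT 12

/* Hypothetical container for the relevant properties of one IMSIC node. */
struct imsic_layout {
	uint64_t base_addr;         /* base address of group 0 (first reg entry) */
	uint32_t guest_index_bits;  /* riscv,guest-index-bits */
	uint32_t group_index_shift; /* riscv,group-index-shift */
};

/*
 * Compose the MSI target address for a (group, hart, guest) triple:
 * guest index sits directly above the page offset, HART index sits
 * directly above the guest index bits, and the group index starts at
 * group_index_shift.
 */
static inline uint64_t imsic_msi_target(const struct imsic_layout *l,
					uint32_t group, uint32_t hart,
					uint32_t guest)
{
	uint64_t addr = l->base_addr;

	addr |= (uint64_t)group << l->group_index_shift;
	addr |= (uint64_t)hart << (l->guest_index_bits + IMSIC_PAGE_SHIFT);
	addr |= (uint64_t)guest << IMSIC_PAGE_SHIFT;

	return addr;
}
```

With the values from Example 2 of the binding (base 0x28000000, group index
shift 24, no guest index bits), group 1 / HART 0 lands at 0x29000000, which
matches the second reg entry in that example.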

diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
new file mode 100644
index 000000000000..84976f17a4a1
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
@@ -0,0 +1,172 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Incoming MSI Controller (IMSIC)
+
+maintainers:
+ - Anup Patel <[email protected]>
+
+description: |
+ The RISC-V advanced interrupt architecture (AIA) defines a per-CPU incoming
+ MSI controller (IMSIC) for handling MSIs in a RISC-V platform. The RISC-V
+ AIA specification can be found at https://github.com/riscv/riscv-aia.
+
+ The IMSIC is a per-CPU (or per-HART) device with a separate interrupt file
+ for each privilege level (machine or supervisor). The configuration of
+ an IMSIC interrupt file is done using AIA CSRs and it also has a 4KB MMIO
+ space to receive MSIs from devices. Each IMSIC interrupt file supports a
+ fixed number of interrupt identities (to distinguish MSIs from devices)
+ which is the same for a given privilege level across CPUs (or HARTs).
+
+ The device tree of a RISC-V platform will have one IMSIC device tree node
+ for each privilege level (machine or supervisor) which collectively describe
+ IMSIC interrupt files at that privilege level across CPUs (or HARTs).
+
+ The arrangement of IMSIC interrupt files in MMIO space of a RISC-V platform
+ follows a particular scheme defined by the RISC-V AIA specification. An IMSIC
+ group is a set of IMSIC interrupt files co-located in MMIO space and we can
+ have multiple IMSIC groups (i.e. clusters, sockets, chiplets, etc) in a
+ RISC-V platform. The MSI target address of an IMSIC interrupt file at a given
+ privilege level (machine or supervisor) encodes group index, HART index,
+ and guest index (shown below).
+
+    XLEN-1            > (HART Index MSB)                  12    0
+    |                  |                                  |     |
+    -------------------------------------------------------------
+    |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index|  0  |
+    -------------------------------------------------------------
+
+allOf:
+ - $ref: /schemas/interrupt-controller.yaml#
+ - $ref: /schemas/interrupt-controller/msi-controller.yaml#
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - qemu,imsics
+ - const: riscv,imsics
+
+ reg:
+ minItems: 1
+ maxItems: 16384
+ description:
+ Base address of each IMSIC group.
+
+ interrupt-controller: true
+
+ "#interrupt-cells":
+ const: 0
+
+ msi-controller: true
+
+ "#msi-cells":
+ const: 0
+
+ interrupts-extended:
+ minItems: 1
+ maxItems: 16384
+ description:
+ This property represents the set of CPUs (or HARTs) for which given
+ device tree node describes the IMSIC interrupt files. Each node pointed
+ to should be a riscv,cpu-intc node, which has a CPU node (i.e. RISC-V
+ HART) as parent.
+
+ riscv,num-ids:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 63
+ maximum: 2047
+ description:
+ Number of interrupt identities supported by IMSIC interrupt file.
+
+ riscv,num-guest-ids:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 63
+ maximum: 2047
+ description:
+ Number of interrupt identities supported by the IMSIC guest interrupt
+ file. When not specified, it is assumed to be the same as the value of
+ the riscv,num-ids property.
+
+ riscv,guest-index-bits:
+ minimum: 0
+ maximum: 7
+ default: 0
+ description:
+ Number of guest index bits in the MSI target address.
+
+ riscv,hart-index-bits:
+ minimum: 0
+ maximum: 15
+ description:
+ Number of HART index bits in the MSI target address. When not
+ specified it is calculated based on the interrupts-extended property.
+
+ riscv,group-index-bits:
+ minimum: 0
+ maximum: 7
+ default: 0
+ description:
+ Number of group index bits in the MSI target address.
+
+ riscv,group-index-shift:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 0
+ maximum: 55
+ default: 24
+ description:
+ The least significant bit position of the group index bits in the
+ MSI target address.
+
+required:
+ - compatible
+ - reg
+ - interrupt-controller
+ - msi-controller
+ - "#msi-cells"
+ - interrupts-extended
+ - riscv,num-ids
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ // Example 1 (Machine-level IMSIC files with just one group):
+
+ interrupt-controller@24000000 {
+ compatible = "qemu,imsics", "riscv,imsics";
+ interrupts-extended = <&cpu1_intc 11>,
+ <&cpu2_intc 11>,
+ <&cpu3_intc 11>,
+ <&cpu4_intc 11>;
+ reg = <0x24000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <0>;
+ msi-controller;
+ #msi-cells = <0>;
+ riscv,num-ids = <127>;
+ };
+
+ - |
+ // Example 2 (Supervisor-level IMSIC files with two groups):
+
+ interrupt-controller@28000000 {
+ compatible = "qemu,imsics", "riscv,imsics";
+ interrupts-extended = <&cpu1_intc 9>,
+ <&cpu2_intc 9>,
+ <&cpu3_intc 9>,
+ <&cpu4_intc 9>;
+ reg = <0x28000000 0x2000>, /* Group0 IMSICs */
+ <0x29000000 0x2000>; /* Group1 IMSICs */
+ interrupt-controller;
+ #interrupt-cells = <0>;
+ msi-controller;
+ #msi-cells = <0>;
+ riscv,num-ids = <127>;
+ riscv,group-index-bits = <1>;
+ riscv,group-index-shift = <24>;
+ };
+...
--
2.34.1


2023-06-13 15:53:42

by Anup Patel

Subject: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

The RISC-V advanced interrupt architecture (AIA) specification defines
a new interrupt controller for managing wired interrupts on a RISC-V
platform. This new interrupt controller is referred to as advanced
platform-level interrupt controller (APLIC) which can forward wired
interrupts to CPUs (or HARTs) as local interrupts OR as message
signaled interrupts.
(For more details, refer to https://github.com/riscv/riscv-aia)

This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
platforms.

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Kconfig | 6 +
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-riscv-aplic.c | 765 ++++++++++++++++++++++++++++
include/linux/irqchip/riscv-aplic.h | 119 +++++
4 files changed, 891 insertions(+)
create mode 100644 drivers/irqchip/irq-riscv-aplic.c
create mode 100644 include/linux/irqchip/riscv-aplic.h
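
Reader's note (not part of the commit): aplic_set_affinity() and
aplic_msi_write_msg() in the driver below both finish by packing fields into
the per-source target register. The following is a minimal sketch of that
packing. The field positions and helper names are assumptions for
illustration, based on the target register layout in the AIA specification;
the driver itself uses the APLIC_TARGET_* macros from riscv-aplic.h, whose
definitions are not shown in this excerpt.

```c
#include <stdint.h>

/* Assumed target register layout (illustrative, not the driver's macros). */
#define TARGET_HART_IDX_SHIFT  18
#define TARGET_HART_IDX_MASK   0x3fffu
#define TARGET_GUEST_IDX_SHIFT 12
#define TARGET_GUEST_IDX_MASK  0x3fu
#define TARGET_EIID_MASK       0x7ffu
#define TARGET_IPRIO_MASK      0xffu

/* Direct mode: route a source to a HART's IDC with a given priority. */
static inline uint32_t aplic_target_direct(uint32_t hart_index, uint32_t prio)
{
	uint32_t val;

	val = (hart_index & TARGET_HART_IDX_MASK) << TARGET_HART_IDX_SHIFT;
	val |= prio & TARGET_IPRIO_MASK;

	return val;
}

/* MSI mode: route a source to a HART/guest interrupt file with an EIID. */
static inline uint32_t aplic_target_msi(uint32_t hart_index,
					uint32_t guest_index, uint32_t eiid)
{
	uint32_t val;

	val = (hart_index & TARGET_HART_IDX_MASK) << TARGET_HART_IDX_SHIFT;
	val |= (guest_index & TARGET_GUEST_IDX_MASK) << TARGET_GUEST_IDX_SHIFT;
	val |= eiid & TARGET_EIID_MASK;

	return val;
}
```

In direct mode the low bits carry the interrupt priority, while in MSI mode
they carry the external interrupt identity that the APLIC writes to the
target IMSIC interrupt file.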

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index d700980372ef..834c0329f583 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -544,6 +544,12 @@ config SIFIVE_PLIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP

+config RISCV_APLIC
+ bool
+ depends on RISCV
+ select IRQ_DOMAIN_HIERARCHY
+ select GENERIC_MSI_IRQ
+
config RISCV_IMSIC
bool
depends on RISCV
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 577bde3e986b..438b8e1a152c 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
+obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic.o
obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
new file mode 100644
index 000000000000..1e710fdf5608
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-aplic.c
@@ -0,0 +1,765 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-aplic: " fmt
+#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-aplic.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/platform_device.h>
+#include <linux/smp.h>
+
+#define APLIC_DEFAULT_PRIORITY 1
+#define APLIC_DISABLE_IDELIVERY 0
+#define APLIC_ENABLE_IDELIVERY 1
+#define APLIC_DISABLE_ITHRESHOLD 1
+#define APLIC_ENABLE_ITHRESHOLD 0
+
+struct aplic_msicfg {
+ phys_addr_t base_ppn;
+ u32 hhxs;
+ u32 hhxw;
+ u32 lhxs;
+ u32 lhxw;
+};
+
+struct aplic_idc {
+ unsigned int hart_index;
+ void __iomem *regs;
+ struct aplic_priv *priv;
+};
+
+struct aplic_priv {
+ struct fwnode_handle *fwnode;
+ u32 gsi_base;
+ u32 nr_irqs;
+ u32 nr_idcs;
+ void __iomem *regs;
+ struct irq_domain *irqdomain;
+ struct aplic_msicfg msicfg;
+ struct cpumask lmask;
+};
+
+static unsigned int aplic_idc_parent_irq;
+static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
+
+static void aplic_irq_unmask(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ writel(d->hwirq, priv->regs + APLIC_SETIENUM);
+
+ if (!priv->nr_idcs)
+ irq_chip_unmask_parent(d);
+}
+
+static void aplic_irq_mask(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
+
+ if (!priv->nr_idcs)
+ irq_chip_mask_parent(d);
+}
+
+static int aplic_set_type(struct irq_data *d, unsigned int type)
+{
+ u32 val = 0;
+ void __iomem *sourcecfg;
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+
+ switch (type) {
+ case IRQ_TYPE_NONE:
+ val = APLIC_SOURCECFG_SM_INACTIVE;
+ break;
+ case IRQ_TYPE_LEVEL_LOW:
+ val = APLIC_SOURCECFG_SM_LEVEL_LOW;
+ break;
+ case IRQ_TYPE_LEVEL_HIGH:
+ val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
+ break;
+ case IRQ_TYPE_EDGE_FALLING:
+ val = APLIC_SOURCECFG_SM_EDGE_FALL;
+ break;
+ case IRQ_TYPE_EDGE_RISING:
+ val = APLIC_SOURCECFG_SM_EDGE_RISE;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
+ sourcecfg += (d->hwirq - 1) * sizeof(u32);
+ writel(val, sourcecfg);
+
+ return 0;
+}
+
+static void aplic_irq_eoi(struct irq_data *d)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ u32 reg_off, reg_mask;
+
+ /*
+ * EOI handling is required only for level-triggered
+ * interrupts in APLIC MSI mode.
+ */
+
+ if (priv->nr_idcs)
+ return;
+
+ reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
+ reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
+ switch (irqd_get_trigger_type(d)) {
+ case IRQ_TYPE_LEVEL_LOW:
+ if (!(readl(priv->regs + reg_off) & reg_mask))
+ writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
+ break;
+ case IRQ_TYPE_LEVEL_HIGH:
+ if (readl(priv->regs + reg_off) & reg_mask)
+ writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
+ break;
+ }
+}
+
+#ifdef CONFIG_SMP
+static int aplic_set_affinity(struct irq_data *d,
+ const struct cpumask *mask_val, bool force)
+{
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ struct aplic_idc *idc;
+ unsigned int cpu, val;
+ struct cpumask amask;
+ void __iomem *target;
+
+ if (!priv->nr_idcs)
+ return irq_chip_set_affinity_parent(d, mask_val, force);
+
+ cpumask_and(&amask, &priv->lmask, mask_val);
+
+ if (force)
+ cpu = cpumask_first(&amask);
+ else
+ cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+ if (cpu >= nr_cpu_ids)
+ return -EINVAL;
+
+ idc = per_cpu_ptr(&aplic_idcs, cpu);
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+ val <<= APLIC_TARGET_HART_IDX_SHIFT;
+ val |= APLIC_DEFAULT_PRIORITY;
+ writel(val, target);
+
+ irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+ return IRQ_SET_MASK_OK_DONE;
+}
+#endif
+
+static struct irq_chip aplic_chip = {
+ .name = "RISC-V APLIC",
+ .irq_mask = aplic_irq_mask,
+ .irq_unmask = aplic_irq_unmask,
+ .irq_set_type = aplic_set_type,
+ .irq_eoi = aplic_irq_eoi,
+#ifdef CONFIG_SMP
+ .irq_set_affinity = aplic_set_affinity,
+#endif
+ .flags = IRQCHIP_SET_TYPE_MASKED |
+ IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
+ u32 gsi_base,
+ unsigned long *hwirq,
+ unsigned int *type)
+{
+ if (WARN_ON(fwspec->param_count < 2))
+ return -EINVAL;
+ if (WARN_ON(!fwspec->param[0]))
+ return -EINVAL;
+
+ /* For DT, gsi_base is always zero. */
+ *hwirq = fwspec->param[0] - gsi_base;
+ *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
+
+ WARN_ON(*type == IRQ_TYPE_NONE);
+
+ return 0;
+}
+
+static int aplic_irqdomain_msi_translate(struct irq_domain *d,
+ struct irq_fwspec *fwspec,
+ unsigned long *hwirq,
+ unsigned int *type)
+{
+ struct aplic_priv *priv = platform_msi_get_host_data(d);
+
+ return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
+}
+
+static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
+ unsigned int virq, unsigned int nr_irqs,
+ void *arg)
+{
+ int i, ret;
+ unsigned int type;
+ irq_hw_number_t hwirq;
+ struct irq_fwspec *fwspec = arg;
+ struct aplic_priv *priv = platform_msi_get_host_data(domain);
+
+ ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
+ if (ret)
+ return ret;
+
+ ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++) {
+ irq_domain_set_info(domain, virq + i, hwirq + i,
+ &aplic_chip, priv, handle_fasteoi_irq,
+ NULL, NULL);
+ /*
+ * APLIC does not implement irq_disable() so Linux interrupt
+ * subsystem will take a lazy approach for disabling an APLIC
+ * interrupt. This means APLIC interrupts are left unmasked
+ * upon system suspend and interrupts are not processed
+ * immediately upon system wake up. To tackle this, we disable
+ * the lazy approach for all APLIC interrupts.
+ */
+ irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
+ }
+
+ return 0;
+}
+
+static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
+ .translate = aplic_irqdomain_msi_translate,
+ .alloc = aplic_irqdomain_msi_alloc,
+ .free = platform_msi_device_domain_free,
+};
+
+static int aplic_irqdomain_idc_translate(struct irq_domain *d,
+ struct irq_fwspec *fwspec,
+ unsigned long *hwirq,
+ unsigned int *type)
+{
+ struct aplic_priv *priv = d->host_data;
+
+ return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
+}
+
+static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
+ unsigned int virq, unsigned int nr_irqs,
+ void *arg)
+{
+ int i, ret;
+ unsigned int type;
+ irq_hw_number_t hwirq;
+ struct irq_fwspec *fwspec = arg;
+ struct aplic_priv *priv = domain->host_data;
+
+ ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++) {
+ irq_domain_set_info(domain, virq + i, hwirq + i,
+ &aplic_chip, priv, handle_fasteoi_irq,
+ NULL, NULL);
+ irq_set_affinity(virq + i, &priv->lmask);
+ /* See the reason described in aplic_irqdomain_msi_alloc() */
+ irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
+ }
+
+ return 0;
+}
+
+static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
+ .translate = aplic_irqdomain_idc_translate,
+ .alloc = aplic_irqdomain_idc_alloc,
+ .free = irq_domain_free_irqs_top,
+};
+
+static void aplic_init_hw_irqs(struct aplic_priv *priv)
+{
+ int i;
+
+ /* Disable all interrupts */
+ for (i = 0; i <= priv->nr_irqs; i += 32)
+ writel(-1U, priv->regs + APLIC_CLRIE_BASE +
+ (i / 32) * sizeof(u32));
+
+ /* Set interrupt type and default priority for all interrupts */
+ for (i = 1; i <= priv->nr_irqs; i++) {
+ writel(0, priv->regs + APLIC_SOURCECFG_BASE +
+ (i - 1) * sizeof(u32));
+ writel(APLIC_DEFAULT_PRIORITY,
+ priv->regs + APLIC_TARGET_BASE +
+ (i - 1) * sizeof(u32));
+ }
+
+ /* Clear APLIC domaincfg */
+ writel(0, priv->regs + APLIC_DOMAINCFG);
+}
+
+static void aplic_init_hw_global(struct aplic_priv *priv)
+{
+ u32 val;
+#ifdef CONFIG_RISCV_M_MODE
+ u32 valH;
+
+ if (!priv->nr_idcs) {
+ val = priv->msicfg.base_ppn;
+ valH = (priv->msicfg.base_ppn >> 32) &
+ APLIC_xMSICFGADDRH_BAPPN_MASK;
+ valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
+ << APLIC_xMSICFGADDRH_LHXW_SHIFT;
+ valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
+ << APLIC_xMSICFGADDRH_HHXW_SHIFT;
+ valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
+ << APLIC_xMSICFGADDRH_LHXS_SHIFT;
+ valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
+ << APLIC_xMSICFGADDRH_HHXS_SHIFT;
+ writel(val, priv->regs + APLIC_xMSICFGADDR);
+ writel(valH, priv->regs + APLIC_xMSICFGADDRH);
+ }
+#endif
+
+ /* Setup APLIC domaincfg register */
+ val = readl(priv->regs + APLIC_DOMAINCFG);
+ val |= APLIC_DOMAINCFG_IE;
+ if (!priv->nr_idcs)
+ val |= APLIC_DOMAINCFG_DM;
+ writel(val, priv->regs + APLIC_DOMAINCFG);
+ if (readl(priv->regs + APLIC_DOMAINCFG) != val)
+ pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
+ priv->fwnode, val);
+}
+
+static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
+{
+ unsigned int group_index, hart_index, guest_index, val;
+ struct irq_data *d = irq_get_irq_data(desc->irq);
+ struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
+ struct aplic_msicfg *mc = &priv->msicfg;
+ phys_addr_t tppn, tbppn, msg_addr;
+ void __iomem *target;
+
+ /* For zeroed MSI, simply write zero into the target register */
+ if (!msg->address_hi && !msg->address_lo && !msg->data) {
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ writel(0, target);
+ return;
+ }
+
+ /* Sanity check on message data */
+ WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
+
+ /* Compute target MSI address */
+ msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
+ tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+
+ /* Compute target HART Base PPN */
+ tbppn = tppn;
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+ tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+ WARN_ON(tbppn != mc->base_ppn);
+
+ /* Compute target group and hart indexes */
+ group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
+ APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
+ hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
+ APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
+ hart_index |= (group_index << mc->lhxw);
+ WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
+
+ /* Compute target guest index */
+ guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
+
+ /* Update IRQ TARGET register */
+ target = priv->regs + APLIC_TARGET_BASE;
+ target += (d->hwirq - 1) * sizeof(u32);
+ val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
+ << APLIC_TARGET_HART_IDX_SHIFT;
+ val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
+ << APLIC_TARGET_GUEST_IDX_SHIFT;
+ val |= (msg->data & APLIC_TARGET_EIID_MASK);
+ writel(val, target);
+}
+
+static int aplic_setup_msi(struct aplic_priv *priv)
+{
+ struct aplic_msicfg *mc = &priv->msicfg;
+ const struct imsic_global_config *imsic_global;
+
+ /*
+ * The APLIC outgoing MSI config registers assume target MSI
+ * controller to be RISC-V AIA IMSIC controller.
+ */
+ imsic_global = imsic_get_global_config();
+ if (!imsic_global) {
+ pr_err("%pfwP: IMSIC global config not found\n",
+ priv->fwnode);
+ return -ENODEV;
+ }
+
+ /* Find number of guest index bits (LHXS) */
+ mc->lhxs = imsic_global->guest_index_bits;
+ if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
+ pr_err("%pfwP: IMSIC guest index bits big for APLIC LHXS\n",
+ priv->fwnode);
+ return -EINVAL;
+ }
+
+ /* Find number of HART index bits (LHXW) */
+ mc->lhxw = imsic_global->hart_index_bits;
+ if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
+ pr_err("%pfwP: IMSIC hart index bits big for APLIC LHXW\n",
+ priv->fwnode);
+ return -EINVAL;
+ }
+
+ /* Find number of group index bits (HHXW) */
+ mc->hhxw = imsic_global->group_index_bits;
+ if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
+ pr_err("%pfwP: IMSIC group index bits big for APLIC HHXW\n",
+ priv->fwnode);
+ return -EINVAL;
+ }
+
+ /* Find first bit position of group index (HHXS) */
+ mc->hhxs = imsic_global->group_index_shift;
+ if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
+ pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
+ priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
+ return -EINVAL;
+ }
+ mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
+ if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
+ pr_err("%pfwP: IMSIC group index shift big for APLIC HHXS\n",
+ priv->fwnode);
+ return -EINVAL;
+ }
+
+ /* Compute PPN base */
+ mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
+ mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
+
+ /* Use all possible CPUs as lmask */
+ cpumask_copy(&priv->lmask, cpu_possible_mask);
+
+ return 0;
+}
+
+/*
+ * To handle APLIC IDC interrupts, we just read the CLAIMI register,
+ * which returns the highest-priority pending interrupt and clears the
+ * pending bit of that interrupt. This process is repeated until the
+ * CLAIMI register returns zero.
+ */
+static void aplic_idc_handle_irq(struct irq_desc *desc)
+{
+ struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ irq_hw_number_t hw_irq;
+ int irq;
+
+ chained_irq_enter(chip, desc);
+
+ while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
+ hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
+ irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
+
+ if (unlikely(irq <= 0))
+ pr_warn_ratelimited("hw_irq %lu mapping not found\n",
+ hw_irq);
+ else
+ generic_handle_irq(irq);
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
+{
+ u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
+ u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
+
+ /* Priority must be less than threshold for interrupt triggering */
+ writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
+
+ /* Delivery must be set to 1 for interrupt triggering */
+ writel(de, idc->regs + APLIC_IDC_IDELIVERY);
+}
+
+static int aplic_idc_dying_cpu(unsigned int cpu)
+{
+ if (aplic_idc_parent_irq)
+ disable_percpu_irq(aplic_idc_parent_irq);
+
+ return 0;
+}
+
+static int aplic_idc_starting_cpu(unsigned int cpu)
+{
+ if (aplic_idc_parent_irq)
+ enable_percpu_irq(aplic_idc_parent_irq,
+ irq_get_trigger_type(aplic_idc_parent_irq));
+
+ return 0;
+}
+
+static int aplic_setup_idc(struct aplic_priv *priv)
+{
+ int i, j, rc, cpu, setup_count = 0;
+ struct fwnode_reference_args parent;
+ struct irq_domain *domain;
+ unsigned long hartid;
+ struct aplic_idc *idc;
+ u32 val;
+
+ /* Setup per-CPU IDC and target CPU mask */
+ for (i = 0; i < priv->nr_idcs; i++) {
+ rc = fwnode_property_get_reference_args(priv->fwnode,
+ "interrupts-extended", "#interrupt-cells",
+ 0, i, &parent);
+ if (rc) {
+ pr_warn("%pfwP: parent irq for IDC%d not found\n",
+ priv->fwnode, i);
+ continue;
+ }
+
+ /*
+ * Skip interrupts other than external interrupts for
+ * current privilege level.
+ */
+ if (parent.args[0] != RV_IRQ_EXT)
+ continue;
+
+ rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
+ if (rc) {
+ pr_warn("%pfwP: invalid hartid for IDC%d\n",
+ priv->fwnode, i);
+ continue;
+ }
+
+ cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0) {
+ pr_warn("%pfwP: invalid cpuid for IDC%d\n",
+ priv->fwnode, i);
+ continue;
+ }
+
+ cpumask_set_cpu(cpu, &priv->lmask);
+
+ idc = per_cpu_ptr(&aplic_idcs, cpu);
+ idc->hart_index = i;
+ idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
+ idc->priv = priv;
+
+ aplic_idc_set_delivery(idc, true);
+
+ /*
+ * Boot cpu might not have APLIC hart_index = 0 so check
+ * and update target registers of all interrupts.
+ */
+ if (cpu == smp_processor_id() && idc->hart_index) {
+ val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
+ val <<= APLIC_TARGET_HART_IDX_SHIFT;
+ val |= APLIC_DEFAULT_PRIORITY;
+ for (j = 1; j <= priv->nr_irqs; j++)
+ writel(val, priv->regs + APLIC_TARGET_BASE +
+ (j - 1) * sizeof(u32));
+ }
+
+ setup_count++;
+ }
+
+ /* Find parent domain and register chained handler */
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+ DOMAIN_BUS_ANY);
+ if (!aplic_idc_parent_irq && domain) {
+ aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (aplic_idc_parent_irq) {
+ irq_set_chained_handler(aplic_idc_parent_irq,
+ aplic_idc_handle_irq);
+
+ /*
+ * Setup CPUHP notifier to enable IDC parent
+ * interrupt on all CPUs
+ */
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "irqchip/riscv/aplic:starting",
+ aplic_idc_starting_cpu,
+ aplic_idc_dying_cpu);
+ }
+ }
+
+ /* Fail if we were not able to setup IDC for any CPU */
+ return (setup_count) ? 0 : -ENODEV;
+}
+
+static int aplic_probe(struct platform_device *pdev)
+{
+ struct fwnode_handle *fwnode = pdev->dev.fwnode;
+ struct fwnode_reference_args parent;
+ struct aplic_priv *priv;
+ struct resource *res;
+ phys_addr_t pa;
+ int rc;
+
+ priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+ priv->fwnode = fwnode;
+
+ /* Map the MMIO registers */
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res) {
+ pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
+ return -EINVAL;
+ }
+ priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
+ if (!priv->regs) {
+ pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
+ return -ENOMEM;
+ }
+
+ /*
+ * Find out GSI base number
+ *
+ * Note: DT does not define "riscv,gsi-base" property so GSI
+ * base is always zero for DT.
+ */
+ rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
+ &priv->gsi_base, 1);
+ if (rc)
+ priv->gsi_base = 0;
+
+ /* Find out number of interrupt sources */
+ rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
+ &priv->nr_irqs, 1);
+ if (rc) {
+ pr_err("%pfwP: failed to get number of interrupt sources\n",
+ fwnode);
+ return rc;
+ }
+
+ /* Setup initial state APLIC interrupts */
+ aplic_init_hw_irqs(priv);
+
+ /*
+ * Find out number of IDCs based on parent interrupts
+ *
+ * If "msi-parent" property is present then we ignore the
+ * APLIC IDCs which forces the APLIC driver to use MSI mode.
+ */
+ if (!fwnode_property_present(fwnode, "msi-parent")) {
+ while (!fwnode_property_get_reference_args(fwnode,
+ "interrupts-extended", "#interrupt-cells",
+ 0, priv->nr_idcs, &parent))
+ priv->nr_idcs++;
+ }
+
+ /* Setup IDCs or MSIs based on number of IDCs */
+ if (priv->nr_idcs)
+ rc = aplic_setup_idc(priv);
+ else
+ rc = aplic_setup_msi(priv);
+ if (rc) {
+ pr_err("%pfwP: failed to set up %s\n",
+ fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
+ return rc;
+ }
+
+ /* Setup global config and interrupt delivery */
+ aplic_init_hw_global(priv);
+
+ /* Create irq domain instance for the APLIC */
+ if (priv->nr_idcs)
+ priv->irqdomain = irq_domain_create_linear(
+ priv->fwnode,
+ priv->nr_irqs + 1,
+ &aplic_irqdomain_idc_ops,
+ priv);
+ else
+ priv->irqdomain = platform_msi_create_device_domain(
+ &pdev->dev,
+ priv->nr_irqs + 1,
+ aplic_msi_write_msg,
+ &aplic_irqdomain_msi_ops,
+ priv);
+ if (!priv->irqdomain) {
+ pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
+ return -ENOMEM;
+ }
+
+ /* Advertise the interrupt controller */
+ if (priv->nr_idcs) {
+ pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
+ priv->fwnode, priv->nr_irqs, priv->nr_idcs);
+ } else {
+ pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
+ pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
+ priv->fwnode, priv->nr_irqs, &pa);
+ }
+
+ return 0;
+}
+
+static const struct of_device_id aplic_match[] = {
+ { .compatible = "riscv,aplic" },
+ {}
+};
+
+static struct platform_driver aplic_driver = {
+ .driver = {
+ .name = "riscv-aplic",
+ .of_match_table = aplic_match,
+ },
+ .probe = aplic_probe,
+};
+builtin_platform_driver(aplic_driver);
+
+static int __init aplic_dt_init(struct device_node *node,
+ struct device_node *parent)
+{
+ /*
+ * The APLIC platform driver needs to be probed early
+ * so for device tree:
+ *
+ * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
+ * provides a hint to the device driver core to probe the
+ * platform driver early.
+ * 2) Clear the OF_POPULATED flag in device_node because
+ * of_irq_init() sets it which prevents creation of
+ * platform device.
+ */
+ node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
+ of_node_clear_flag(node, OF_POPULATED);
+ return 0;
+}
+IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
new file mode 100644
index 000000000000..97e198ea0109
--- /dev/null
+++ b/include/linux/irqchip/riscv-aplic.h
@@ -0,0 +1,119 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
+#define __LINUX_IRQCHIP_RISCV_APLIC_H
+
+#include <linux/bitops.h>
+
+#define APLIC_MAX_IDC BIT(14)
+#define APLIC_MAX_SOURCE 1024
+
+#define APLIC_DOMAINCFG 0x0000
+#define APLIC_DOMAINCFG_RDONLY 0x80000000
+#define APLIC_DOMAINCFG_IE BIT(8)
+#define APLIC_DOMAINCFG_DM BIT(2)
+#define APLIC_DOMAINCFG_BE BIT(0)
+
+#define APLIC_SOURCECFG_BASE 0x0004
+#define APLIC_SOURCECFG_D BIT(10)
+#define APLIC_SOURCECFG_CHILDIDX_MASK 0x000003ff
+#define APLIC_SOURCECFG_SM_MASK 0x00000007
+#define APLIC_SOURCECFG_SM_INACTIVE 0x0
+#define APLIC_SOURCECFG_SM_DETACH 0x1
+#define APLIC_SOURCECFG_SM_EDGE_RISE 0x4
+#define APLIC_SOURCECFG_SM_EDGE_FALL 0x5
+#define APLIC_SOURCECFG_SM_LEVEL_HIGH 0x6
+#define APLIC_SOURCECFG_SM_LEVEL_LOW 0x7
+
+#define APLIC_MMSICFGADDR 0x1bc0
+#define APLIC_MMSICFGADDRH 0x1bc4
+#define APLIC_SMSICFGADDR 0x1bc8
+#define APLIC_SMSICFGADDRH 0x1bcc
+
+#ifdef CONFIG_RISCV_M_MODE
+#define APLIC_xMSICFGADDR APLIC_MMSICFGADDR
+#define APLIC_xMSICFGADDRH APLIC_MMSICFGADDRH
+#else
+#define APLIC_xMSICFGADDR APLIC_SMSICFGADDR
+#define APLIC_xMSICFGADDRH APLIC_SMSICFGADDRH
+#endif
+
+#define APLIC_xMSICFGADDRH_L BIT(31)
+#define APLIC_xMSICFGADDRH_HHXS_MASK 0x1f
+#define APLIC_xMSICFGADDRH_HHXS_SHIFT 24
+#define APLIC_xMSICFGADDRH_LHXS_MASK 0x7
+#define APLIC_xMSICFGADDRH_LHXS_SHIFT 20
+#define APLIC_xMSICFGADDRH_HHXW_MASK 0x7
+#define APLIC_xMSICFGADDRH_HHXW_SHIFT 16
+#define APLIC_xMSICFGADDRH_LHXW_MASK 0xf
+#define APLIC_xMSICFGADDRH_LHXW_SHIFT 12
+#define APLIC_xMSICFGADDRH_BAPPN_MASK 0xfff
+
+#define APLIC_xMSICFGADDR_PPN_SHIFT 12
+
+#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
+ (BIT(__lhxs) - 1)
+
+#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
+ (BIT(__lhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
+ ((__lhxs))
+#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
+ (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
+ APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
+
+#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
+ (BIT(__hhxw) - 1)
+#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
+ ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
+#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
+ (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
+ APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
+
+#define APLIC_IRQBITS_PER_REG 32
+
+#define APLIC_SETIP_BASE 0x1c00
+#define APLIC_SETIPNUM 0x1cdc
+
+#define APLIC_CLRIP_BASE 0x1d00
+#define APLIC_CLRIPNUM 0x1ddc
+
+#define APLIC_SETIE_BASE 0x1e00
+#define APLIC_SETIENUM 0x1edc
+
+#define APLIC_CLRIE_BASE 0x1f00
+#define APLIC_CLRIENUM 0x1fdc
+
+#define APLIC_SETIPNUM_LE 0x2000
+#define APLIC_SETIPNUM_BE 0x2004
+
+#define APLIC_GENMSI 0x3000
+
+#define APLIC_TARGET_BASE 0x3004
+#define APLIC_TARGET_HART_IDX_SHIFT 18
+#define APLIC_TARGET_HART_IDX_MASK 0x3fff
+#define APLIC_TARGET_GUEST_IDX_SHIFT 12
+#define APLIC_TARGET_GUEST_IDX_MASK 0x3f
+#define APLIC_TARGET_IPRIO_MASK 0xff
+#define APLIC_TARGET_EIID_MASK 0x7ff
+
+#define APLIC_IDC_BASE 0x4000
+#define APLIC_IDC_SIZE 32
+
+#define APLIC_IDC_IDELIVERY 0x00
+
+#define APLIC_IDC_IFORCE 0x04
+
+#define APLIC_IDC_ITHRESHOLD 0x08
+
+#define APLIC_IDC_TOPI 0x18
+#define APLIC_IDC_TOPI_ID_SHIFT 16
+#define APLIC_IDC_TOPI_ID_MASK 0x3ff
+#define APLIC_IDC_TOPI_PRIO_MASK 0xff
+
+#define APLIC_IDC_CLAIMI 0x1c
+
+#endif
--
2.34.1


2023-06-13 15:56:19

by Anup Patel

Subject: [PATCH v4 05/10] irqchip/riscv-imsic: Add support for PCI MSI irqdomain

The Linux PCI framework requires its own dedicated MSI irqdomain, so
create a PCI MSI irqdomain as a child of the IMSIC base irqdomain.

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Kconfig | 7 +++++
drivers/irqchip/irq-riscv-imsic.c | 49 +++++++++++++++++++++++++++++++
2 files changed, 56 insertions(+)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 8ef18be5f37b..d700980372ef 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -550,6 +550,13 @@ config RISCV_IMSIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_MSI_IRQ

+config RISCV_IMSIC_PCI
+ bool
+ depends on RISCV_IMSIC
+ depends on PCI
+ depends on PCI_MSI
+ default RISCV_IMSIC
+
config EXYNOS_IRQ_COMBINER
bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
index 971fad638c9f..30247c84a6b0 100644
--- a/drivers/irqchip/irq-riscv-imsic.c
+++ b/drivers/irqchip/irq-riscv-imsic.c
@@ -18,6 +18,7 @@
#include <linux/module.h>
#include <linux/msi.h>
#include <linux/of_address.h>
+#include <linux/pci.h>
#include <linux/platform_device.h>
#include <linux/spinlock.h>
#include <linux/smp.h>
@@ -81,6 +82,7 @@ struct imsic_priv {

/* IRQ domains */
struct irq_domain *base_domain;
+ struct irq_domain *pci_domain;
struct irq_domain *plat_domain;
};

@@ -547,6 +549,39 @@ static const struct irq_domain_ops imsic_base_domain_ops = {
.free = imsic_irq_domain_free,
};

+#ifdef CONFIG_RISCV_IMSIC_PCI
+
+static void imsic_pci_mask_irq(struct irq_data *d)
+{
+ pci_msi_mask_irq(d);
+ irq_chip_mask_parent(d);
+}
+
+static void imsic_pci_unmask_irq(struct irq_data *d)
+{
+ pci_msi_unmask_irq(d);
+ irq_chip_unmask_parent(d);
+}
+
+static struct irq_chip imsic_pci_irq_chip = {
+ .name = "RISC-V IMSIC-PCI",
+ .irq_mask = imsic_pci_mask_irq,
+ .irq_unmask = imsic_pci_unmask_irq,
+ .irq_eoi = irq_chip_eoi_parent,
+};
+
+static struct msi_domain_ops imsic_pci_domain_ops = {
+};
+
+static struct msi_domain_info imsic_pci_domain_info = {
+ .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+ MSI_FLAG_PCI_MSIX | MSI_FLAG_MULTI_PCI_MSI),
+ .ops = &imsic_pci_domain_ops,
+ .chip = &imsic_pci_irq_chip,
+};
+
+#endif
+
static struct irq_chip imsic_plat_irq_chip = {
.name = "RISC-V IMSIC-PLAT",
};
@@ -571,12 +606,26 @@ static int __init imsic_irq_domains_init(struct fwnode_handle *fwnode)
}
irq_domain_update_bus_token(imsic->base_domain, DOMAIN_BUS_NEXUS);

+#ifdef CONFIG_RISCV_IMSIC_PCI
+ /* Create PCI MSI domain */
+ imsic->pci_domain = pci_msi_create_irq_domain(fwnode,
+ &imsic_pci_domain_info,
+ imsic->base_domain);
+ if (!imsic->pci_domain) {
+ pr_err("Failed to create IMSIC PCI domain\n");
+ irq_domain_remove(imsic->base_domain);
+ return -ENOMEM;
+ }
+#endif
+
/* Create Platform MSI domain */
imsic->plat_domain = platform_msi_create_irq_domain(fwnode,
&imsic_plat_domain_info,
imsic->base_domain);
if (!imsic->plat_domain) {
pr_err("Failed to create IMSIC platform domain\n");
+ if (imsic->pci_domain)
+ irq_domain_remove(imsic->pci_domain);
irq_domain_remove(imsic->base_domain);
return -ENOMEM;
}
--
2.34.1


2023-06-13 15:59:27

by Anup Patel

Subject: [PATCH v4 07/10] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC

Add a DT bindings document for the RISC-V advanced platform level
interrupt controller (APLIC) defined by the RISC-V advanced interrupt
architecture (AIA) specification.

Signed-off-by: Anup Patel <[email protected]>
---
.../interrupt-controller/riscv,aplic.yaml | 169 ++++++++++++++++++
1 file changed, 169 insertions(+)
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml

diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
new file mode 100644
index 000000000000..e21de99b10a2
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
@@ -0,0 +1,169 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Advanced Platform Level Interrupt Controller (APLIC)
+
+maintainers:
+ - Anup Patel <[email protected]>
+
+description:
+ The RISC-V advanced interrupt architecture (AIA) defines an advanced
+ platform level interrupt controller (APLIC) for handling wired interrupts
+ in a RISC-V platform. The RISC-V AIA specification can be found at
+ https://github.com/riscv/riscv-aia.
+
+ The RISC-V APLIC is implemented as hierarchical APLIC domains where all
+ interrupt sources connect to the root APLIC domain and a parent APLIC
+ domain can delegate interrupt sources to its child APLIC domains. There
+ is one device tree node for each APLIC domain.
+
+allOf:
+ - $ref: /schemas/interrupt-controller.yaml#
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - qemu,aplic
+ - const: riscv,aplic
+
+ reg:
+ maxItems: 1
+
+ interrupt-controller: true
+
+ "#interrupt-cells":
+ const: 2
+
+ interrupts-extended:
+ minItems: 1
+ maxItems: 16384
+ description:
+ Given APLIC domain directly injects external interrupts to a set of
+ RISC-V HARTS (or CPUs). Each node pointed to should be a riscv,cpu-intc
+ node, which has a CPU node (i.e. RISC-V HART) as parent.
+
+ msi-parent:
+ description:
+ Given APLIC domain forwards wired interrupts as MSIs to an AIA incoming
+ message signaled interrupt controller (IMSIC). If both "msi-parent" and
+ "interrupts-extended" properties are present then it means the APLIC
+ domain supports both MSI mode and Direct mode in HW. In this case, the
+ APLIC driver has to choose between MSI mode or Direct mode.
+
+ riscv,num-sources:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ maximum: 1023
+ description:
+ Specifies the number of wired interrupt sources supported by this
+ APLIC domain.
+
+ riscv,children:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ minItems: 1
+ maxItems: 1024
+ items:
+ maxItems: 1
+ description:
+ A list of child APLIC domains for the given APLIC domain. Each child
+ APLIC domain is assigned a child index in increasing order, with the
+ first child APLIC domain assigned child index 0. The APLIC domain child
+ index is used by firmware to delegate interrupts from the given APLIC
+ domain to a particular child APLIC domain.
+
+ riscv,delegation:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ minItems: 1
+ maxItems: 1024
+ items:
+ items:
+ - description: child APLIC domain phandle
+ - description: first interrupt number of the parent APLIC domain (inclusive)
+ - description: last interrupt number of the parent APLIC domain (inclusive)
+ description:
+ An interrupt delegation list where each entry is a triple consisting
+ of child APLIC domain phandle, first interrupt number of the parent
+ APLIC domain, and last interrupt number of the parent APLIC domain.
+ Firmware must configure interrupt delegation registers based on
+ interrupt delegation list.
+
+required:
+ - compatible
+ - reg
+ - interrupt-controller
+ - "#interrupt-cells"
+ - riscv,num-sources
+
+anyOf:
+ - required:
+ - interrupts-extended
+ - required:
+ - msi-parent
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ // Example 1 (APLIC domains directly injecting interrupt to HARTs):
+
+ interrupt-controller@c000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ interrupts-extended = <&cpu1_intc 11>,
+ <&cpu2_intc 11>,
+ <&cpu3_intc 11>,
+ <&cpu4_intc 11>;
+ reg = <0xc000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ riscv,children = <&aplic1>, <&aplic2>;
+ riscv,delegation = <&aplic1 1 63>;
+ };
+
+ aplic1: interrupt-controller@d000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ interrupts-extended = <&cpu1_intc 9>,
+ <&cpu2_intc 9>;
+ reg = <0xd000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+
+ aplic2: interrupt-controller@e000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ interrupts-extended = <&cpu3_intc 9>,
+ <&cpu4_intc 9>;
+ reg = <0xe000000 0x4080>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+
+ - |
+ // Example 2 (APLIC domains forwarding interrupts as MSIs):
+
+ interrupt-controller@c000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ msi-parent = <&imsic_mlevel>;
+ reg = <0xc000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ riscv,children = <&aplic3>;
+ riscv,delegation = <&aplic3 1 63>;
+ };
+
+ aplic3: interrupt-controller@d000000 {
+ compatible = "qemu,aplic", "riscv,aplic";
+ msi-parent = <&imsic_slevel>;
+ reg = <0xd000000 0x4000>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ riscv,num-sources = <63>;
+ };
+...
--
2.34.1


2023-06-13 16:03:22

by Anup Patel

Subject: [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support

We have a separate RISC-V IMSIC MSI address for each CPU so changing
MSI (or IRQ) affinity results in re-programming of MSI address in
the PCIe (or platform) device.

Currently, iommu_dma_prepare_msi() is called only once, at the time
of IRQ allocation, so the IOMMU DMA domain only has a mapping for
one MSI page. This means iommu_dma_compose_msi_msg(), called by
imsic_irq_compose_msi_msg(), always uses the same MSI page
irrespective of the target CPU's MSI address. In other words,
changing MSI (or IRQ) affinity for a device using an IOMMU DMA
domain does not work.

To address the above issue, we do the following:
1) Map MSI pages for all CPUs in imsic_irq_domain_alloc()
using iommu_dma_prepare_msi().
2) Extend iommu_dma_compose_msi_msg() to lookup the correct
msi_page whenever the msi_page stored as iommu cookie
does not match.
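
The lookup in step 2 can be sketched in plain C as below. This is only
a userspace illustration under stated assumptions, not the kernel code:
struct msi_page, msi_granule_base(), msi_page_lookup() and
msi_translate() are hypothetical names, and a plain array stands in for
the list of MSI pages that step 1 is assumed to have mapped.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one MSI page mapped into the IOMMU domain. */
struct msi_page {
	uint64_t phys;	/* granule-aligned physical address of the page */
	uint64_t iova;	/* IOVA at which the device sees that page */
};

/* Round an MSI address down to its granule (granule is a power of two). */
static uint64_t msi_granule_base(uint64_t addr, uint64_t granule)
{
	return addr & ~(granule - 1);
}

/* Return the mapped page covering addr, or NULL if none was prepared. */
static const struct msi_page *msi_page_lookup(const struct msi_page *pages,
					      size_t nr_pages,
					      uint64_t addr, uint64_t granule)
{
	uint64_t base = msi_granule_base(addr, granule);
	size_t i;

	for (i = 0; i < nr_pages; i++) {
		if (pages[i].phys == base)
			return &pages[i];
	}
	return NULL;
}

/* Rewrite a physical MSI address into the IOVA the device must write to. */
static int msi_translate(const struct msi_page *pages, size_t nr_pages,
			 uint64_t addr, uint64_t granule, uint64_t *out_iova)
{
	const struct msi_page *page;

	page = msi_page_lookup(pages, nr_pages, addr, granule);
	if (!page)
		return -1;

	/* Keep the offset within the granule, swap the page base for the IOVA. */
	*out_iova = page->iova + (addr & (granule - 1));
	return 0;
}
```

The point of the sketch is that the page is selected from the address
carried in the message being composed, so a new target CPU picks up its
own mapping instead of the one cached at allocation time.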

Reported-by: Vincent Chen <[email protected]>
Signed-off-by: Anup Patel <[email protected]>
---
drivers/iommu/dma-iommu.c | 24 +++++++++++++++++++++---
drivers/irqchip/irq-riscv-imsic.c | 23 +++++++++++------------
2 files changed, 32 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 7a9f0b0bddbd..df96bcccbe28 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1687,14 +1687,32 @@ void iommu_dma_compose_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
struct device *dev = msi_desc_to_dev(desc);
const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
const struct iommu_dma_msi_page *msi_page;
+ struct iommu_dma_cookie *cookie;
+ phys_addr_t msi_addr;

- msi_page = msi_desc_get_iommu_cookie(desc);
+ if (!domain || !domain->iova_cookie)
+ return;

- if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
+ cookie = domain->iova_cookie;
+ msi_addr = ((u64)msg->address_hi << 32) | msg->address_lo;
+ msi_addr &= ~(phys_addr_t)(cookie_msi_granule(cookie) - 1);
+
+ msi_page = msi_desc_get_iommu_cookie(desc);
+ if (!msi_page || msi_page->phys != msi_addr) {
+ msi_desc_set_iommu_cookie(desc, NULL);
+ list_for_each_entry(msi_page, &cookie->msi_page_list, list) {
+ if (msi_page->phys == msi_addr) {
+ msi_desc_set_iommu_cookie(desc, msi_page);
+ break;
+ }
+ }
+ msi_page = msi_desc_get_iommu_cookie(desc);
+ }
+ if (WARN_ON(!msi_page))
return;

msg->address_hi = upper_32_bits(msi_page->iova);
- msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
+ msg->address_lo &= cookie_msi_granule(cookie) - 1;
msg->address_lo += lower_32_bits(msi_page->iova);
}

diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
index 30247c84a6b0..19dedd036dd4 100644
--- a/drivers/irqchip/irq-riscv-imsic.c
+++ b/drivers/irqchip/irq-riscv-imsic.c
@@ -493,11 +493,18 @@ static int imsic_irq_domain_alloc(struct irq_domain *domain,
int i, hwirq, err = 0;
unsigned int cpu;

- err = imsic_get_cpu(&imsic->lmask, false, &cpu);
- if (err)
- return err;
+ /* Map MSI address of all CPUs */
+ for_each_cpu(cpu, &imsic->lmask) {
+ err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+ if (err)
+ return err;
+
+ err = iommu_dma_prepare_msi(info->desc, msi_addr);
+ if (err)
+ return err;
+ }

- err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+ err = imsic_get_cpu(&imsic->lmask, false, &cpu);
if (err)
return err;

@@ -505,10 +512,6 @@ static int imsic_irq_domain_alloc(struct irq_domain *domain,
if (hwirq < 0)
return hwirq;

- err = iommu_dma_prepare_msi(info->desc, msi_addr);
- if (err)
- goto fail;
-
for (i = 0; i < nr_irqs; i++) {
imsic_id_set_target(hwirq + i, cpu);
irq_domain_set_info(domain, virq + i, hwirq + i,
@@ -528,10 +531,6 @@ static int imsic_irq_domain_alloc(struct irq_domain *domain,
}

return 0;
-
-fail:
- imsic_ids_free(hwirq, get_count_order(nr_irqs));
- return err;
}

static void imsic_irq_domain_free(struct irq_domain *domain,
--
2.34.1


2023-06-13 16:05:59

by Anup Patel

Subject: [PATCH v4 09/10] RISC-V: Select APLIC and IMSIC drivers

The QEMU virt machine supports AIA emulation and we also have
quite a few RISC-V platforms with AIA support under development
so let us select APLIC and IMSIC drivers for all RISC-V platforms.

Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
---
arch/riscv/Kconfig | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index ff37d8ebe989..19233d59be37 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -136,6 +136,8 @@ config RISCV
select PCI_DOMAINS_GENERIC if PCI
select PCI_MSI if PCI
select RISCV_ALTERNATIVE if !XIP_KERNEL
+ select RISCV_APLIC
+ select RISCV_IMSIC
select RISCV_INTC
select RISCV_TIMER if RISCV_SBI
select SIFIVE_PLIC
--
2.34.1


2023-06-13 16:06:24

by Anup Patel

Subject: [PATCH v4 02/10] irqchip/riscv-intc: Add support for RISC-V AIA

The RISC-V advanced interrupt architecture (AIA) extends the per-HART
local interrupts in the following ways:
1. Minimum 64 local interrupts for both RV32 and RV64
2. Ability to process multiple pending local interrupts in the same
interrupt handler
3. Priority configuration for each local interrupt
4. Special CSRs to configure/access the per-HART MSI controller

This patch adds support for RISC-V AIA in the RISC-V intc driver.
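
As a rough illustration of points 1 and 2, the userspace sketch below
shows how a local interrupt number can be split into an enable register
and a bit, and how an interrupt number is taken from a TOPI value. The
names struct ie_slot, ie_slot_for() and topi_to_hwirq() are hypothetical,
and the TOPI_IID_SHIFT value of 16 is an assumption based on the AIA
specification's layout of the top-interrupt CSR.

```c
#include <stdint.h>

/* Assumed: the interrupt identity starts at bit 16 of the TOPI value. */
#define TOPI_IID_SHIFT	16

/* Hypothetical description of where the enable bit of a hwirq lives. */
struct ie_slot {
	int high;	/* 0: low enable CSR, 1: high-half enable CSR */
	uint64_t mask;	/* bit to set or clear in that CSR */
};

/*
 * With xlen-bit CSRs, interrupts below xlen live in the low enable CSR.
 * Interrupts at or above xlen (only possible when xlen is 32 and 64
 * local interrupts exist) live in the high-half CSR.
 */
static struct ie_slot ie_slot_for(unsigned int hwirq, unsigned int xlen)
{
	struct ie_slot slot;

	slot.high = hwirq >= xlen;
	slot.mask = 1ULL << (slot.high ? hwirq - xlen : hwirq);
	return slot;
}

/* Extract the interrupt number from a non-zero TOPI value. */
static unsigned int topi_to_hwirq(unsigned long topi)
{
	return (unsigned int)(topi >> TOPI_IID_SHIFT);
}
```

A handler built on this idea keeps reading TOPI and dispatching until it
reads zero, which is how several pending interrupts can be served from
one trap.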

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/irq-riscv-intc.c | 36 ++++++++++++++++++++++++++------
1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index 4adeee1bc391..e235bf1708a4 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -17,6 +17,7 @@
#include <linux/module.h>
#include <linux/of.h>
#include <linux/smp.h>
+#include <asm/hwcap.h>

static struct irq_domain *intc_domain;

@@ -30,6 +31,15 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
generic_handle_domain_irq(intc_domain, cause);
}

+static asmlinkage void riscv_intc_aia_irq(struct pt_regs *regs)
+{
+ unsigned long topi;
+
+ while ((topi = csr_read(CSR_TOPI)))
+ generic_handle_domain_irq(intc_domain,
+ topi >> TOPI_IID_SHIFT);
+}
+
/*
* On RISC-V systems local interrupts are masked or unmasked by writing
* the SIE (Supervisor Interrupt Enable) CSR. As CSRs can only be written
@@ -39,12 +49,18 @@ static asmlinkage void riscv_intc_irq(struct pt_regs *regs)

static void riscv_intc_irq_mask(struct irq_data *d)
{
- csr_clear(CSR_IE, BIT(d->hwirq));
+ if (d->hwirq < BITS_PER_LONG)
+ csr_clear(CSR_IE, BIT(d->hwirq));
+ else
+ csr_clear(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
}

static void riscv_intc_irq_unmask(struct irq_data *d)
{
- csr_set(CSR_IE, BIT(d->hwirq));
+ if (d->hwirq < BITS_PER_LONG)
+ csr_set(CSR_IE, BIT(d->hwirq));
+ else
+ csr_set(CSR_IEH, BIT(d->hwirq - BITS_PER_LONG));
}

static void riscv_intc_irq_eoi(struct irq_data *d)
@@ -115,16 +131,22 @@ static struct fwnode_handle *riscv_intc_hwnode(void)

static int __init riscv_intc_init_common(struct fwnode_handle *fn)
{
- int rc;
+ int rc, nr_irqs = BITS_PER_LONG;
+
+ if (riscv_isa_extension_available(NULL, SxAIA) && BITS_PER_LONG == 32)
+ nr_irqs = nr_irqs * 2;

- intc_domain = irq_domain_create_linear(fn, BITS_PER_LONG,
+ intc_domain = irq_domain_create_linear(fn, nr_irqs,
&riscv_intc_domain_ops, NULL);
if (!intc_domain) {
pr_err("unable to add IRQ domain\n");
return -ENXIO;
}

- rc = set_handle_irq(&riscv_intc_irq);
+ if (riscv_isa_extension_available(NULL, SxAIA))
+ rc = set_handle_irq(&riscv_intc_aia_irq);
+ else
+ rc = set_handle_irq(&riscv_intc_irq);
if (rc) {
pr_err("failed to set irq handler\n");
return rc;
@@ -132,7 +154,9 @@ static int __init riscv_intc_init_common(struct fwnode_handle *fn)

riscv_set_intc_hwnode_fn(riscv_intc_hwnode);

- pr_info("%d local interrupts mapped\n", BITS_PER_LONG);
+ pr_info("%d local interrupts mapped%s\n",
+ nr_irqs, (riscv_isa_extension_available(NULL, SxAIA)) ?
+ " using AIA" : "");

return 0;
}
--
2.34.1


2023-06-13 16:06:30

by Anup Patel

Subject: [PATCH v4 01/10] RISC-V: Add riscv_fw_parent_hartid() function

Add a common riscv_fw_parent_hartid() helper which lets device drivers
get the parent hartid of the INTC (i.e. local interrupt controller)
fwnode. This should work for both DT and ACPI.

Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/include/asm/processor.h | 3 +++
arch/riscv/kernel/cpu.c | 16 ++++++++++++++++
2 files changed, 19 insertions(+)

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 94a0590c6971..6fb8bbec8459 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -77,6 +77,9 @@ struct device_node;
int riscv_of_processor_hartid(struct device_node *node, unsigned long *hartid);
int riscv_of_parent_hartid(struct device_node *node, unsigned long *hartid);

+struct fwnode_handle;
+int riscv_fw_parent_hartid(struct fwnode_handle *node, unsigned long *hartid);
+
extern void riscv_fill_hwcap(void);
extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);

diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 5de6fb703cc2..67b335789b22 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -73,6 +73,22 @@ int riscv_of_parent_hartid(struct device_node *node, unsigned long *hartid)
return -1;
}

+/* Find hart ID of the CPU fwnode under which given fwnode falls. */
+int riscv_fw_parent_hartid(struct fwnode_handle *node, unsigned long *hartid)
+{
+ int rc;
+ u64 temp;
+
+ if (!is_of_node(node)) {
+ rc = fwnode_property_read_u64_array(node, "hartid", &temp, 1);
+ if (!rc)
+ *hartid = temp;
+ } else
+ rc = riscv_of_parent_hartid(to_of_node(node), hartid);
+
+ return rc;
+}
+
DEFINE_PER_CPU(struct riscv_cpuinfo, riscv_cpuinfo);

unsigned long riscv_cached_mvendorid(unsigned int cpu_id)
--
2.34.1


2023-06-13 16:06:36

by Anup Patel

Subject: [PATCH v4 04/10] irqchip: Add RISC-V incoming MSI controller driver

The RISC-V advanced interrupt architecture (AIA) specification defines
a new MSI controller for managing MSIs and IPIs on a RISC-V platform.

This new MSI controller is referred to as incoming message signalled
interrupt controller (IMSIC) which manages MSI on per-HART (or per-CPU)
basis. (For more details, refer to https://github.com/riscv/riscv-aia)

This patch adds an irqchip driver for RISC-V IMSIC which provides
IPIs and platform MSIs to the Linux RISC-V kernel.
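
The driver reaches the per-identity enable and pending bits through the
indirect ISELECT/IREG CSR pair. The index arithmetic involved can be
sketched in userspace C as below. The names struct eix_pos and
eix_locate() are hypothetical, and the three constants are assumptions
taken from the AIA specification's indirect register numbering, since
their definitions are not part of this excerpt.

```c
#include <stdint.h>

/* Assumed values, following the AIA indirect register numbering. */
#define IMSIC_EIP0		0x80	/* first interrupt-pending register */
#define IMSIC_EIE0		0xc0	/* first interrupt-enable register */
#define IMSIC_EIPx_BITS		32	/* identities covered per register number */

/* Hypothetical result: which indirect register and which bit within it. */
struct eix_pos {
	unsigned long isel;	/* value to write to ISELECT */
	unsigned int bit;	/* bit position within IREG */
};

/*
 * Locate the pending (pend != 0) or enable (pend == 0) bit of an
 * interrupt identity. Register numbers advance in 32-bit units, so with
 * xlen == 64 each access covers two register numbers and only the
 * even-numbered ones are used.
 */
static struct eix_pos eix_locate(unsigned int id, unsigned int xlen, int pend)
{
	struct eix_pos pos;

	pos.isel = (id / xlen) * (xlen / IMSIC_EIPx_BITS);
	pos.isel += pend ? IMSIC_EIP0 : IMSIC_EIE0;
	pos.bit = id % xlen;
	return pos;
}
```

For example, under these assumptions identity 70 on a 64-bit HART maps
to enable register 0xc2, bit 6.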

Signed-off-by: Anup Patel <[email protected]>
---
drivers/irqchip/Kconfig | 7 +-
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-riscv-imsic.c | 1028 +++++++++++++++++++++++++++
include/linux/irqchip/riscv-imsic.h | 86 +++
4 files changed, 1121 insertions(+), 1 deletion(-)
create mode 100644 drivers/irqchip/irq-riscv-imsic.c
create mode 100644 include/linux/irqchip/riscv-imsic.h

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 09e422da482f..8ef18be5f37b 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -30,7 +30,6 @@ config ARM_GIC_V2M

config GIC_NON_BANKED
bool
-
config ARM_GIC_V3
bool
select IRQ_DOMAIN_HIERARCHY
@@ -545,6 +544,12 @@ config SIFIVE_PLIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP

+config RISCV_IMSIC
+ bool
+ depends on RISCV
+ select IRQ_DOMAIN_HIERARCHY
+ select GENERIC_MSI_IRQ
+
config EXYNOS_IRQ_COMBINER
bool "Samsung Exynos IRQ combiner support" if COMPILE_TEST
depends on (ARCH_EXYNOS && ARM) || COMPILE_TEST
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index ffd945fe71aa..577bde3e986b 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
+obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
diff --git a/drivers/irqchip/irq-riscv-imsic.c b/drivers/irqchip/irq-riscv-imsic.c
new file mode 100644
index 000000000000..971fad638c9f
--- /dev/null
+++ b/drivers/irqchip/irq-riscv-imsic.c
@@ -0,0 +1,1028 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-imsic: " fmt
+#include <linux/bitmap.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqchip/riscv-imsic.h>
+#include <linux/irqdomain.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/of_address.h>
+#include <linux/platform_device.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+#include <asm/hwcap.h>
+
+#define IMSIC_DISABLE_EIDELIVERY 0
+#define IMSIC_ENABLE_EIDELIVERY 1
+#define IMSIC_DISABLE_EITHRESHOLD 1
+#define IMSIC_ENABLE_EITHRESHOLD 0
+
+/*
+ * The IMSIC driver uses 1 IPI for ID synchronization and
+ * arch/riscv/kernel/smp.c requires 6 IPIs so we fix the
+ * total number of IPIs to 8.
+ */
+#define IMSIC_NR_IPI 8
+
+#define imsic_csr_write(__c, __v) \
+do { \
+ csr_write(CSR_ISELECT, __c); \
+ csr_write(CSR_IREG, __v); \
+} while (0)
+
+#define imsic_csr_read(__c) \
+({ \
+ unsigned long __v; \
+ csr_write(CSR_ISELECT, __c); \
+ __v = csr_read(CSR_IREG); \
+ __v; \
+})
+
+#define imsic_csr_set(__c, __v) \
+do { \
+ csr_write(CSR_ISELECT, __c); \
+ csr_set(CSR_IREG, __v); \
+} while (0)
+
+#define imsic_csr_clear(__c, __v) \
+do { \
+ csr_write(CSR_ISELECT, __c); \
+ csr_clear(CSR_IREG, __v); \
+} while (0)
+
+struct imsic_priv {
+ /* Global configuration common for all HARTs */
+ struct imsic_global_config global;
+
+ /* Global state of interrupt identities */
+ raw_spinlock_t ids_lock;
+ unsigned long *ids_used_bimap;
+ unsigned long *ids_enabled_bimap;
+ unsigned int *ids_target_cpu;
+
+ /* Mask for connected CPUs */
+ struct cpumask lmask;
+
+ /* IPI interrupt identity and synchronization */
+ u32 ipi_id;
+ int ipi_virq;
+ struct irq_desc *ipi_lsync_desc;
+
+ /* IRQ domains */
+ struct irq_domain *base_domain;
+ struct irq_domain *plat_domain;
+};
+
+static struct imsic_priv *imsic;
+static int imsic_parent_irq;
+
+const struct imsic_global_config *imsic_get_global_config(void)
+{
+ return (imsic) ? &imsic->global : NULL;
+}
+EXPORT_SYMBOL_GPL(imsic_get_global_config);
+
+static int imsic_cpu_page_phys(unsigned int cpu,
+ unsigned int guest_index,
+ phys_addr_t *out_msi_pa)
+{
+ struct imsic_global_config *global;
+ struct imsic_local_config *local;
+
+ global = &imsic->global;
+ local = per_cpu_ptr(global->local, cpu);
+
+ if (BIT(global->guest_index_bits) <= guest_index)
+ return -EINVAL;
+
+ if (out_msi_pa)
+ *out_msi_pa = local->msi_pa +
+ (guest_index * IMSIC_MMIO_PAGE_SZ);
+
+ return 0;
+}
+
+static int imsic_get_cpu(const struct cpumask *mask_val, bool force,
+ unsigned int *out_target_cpu)
+{
+ struct cpumask amask;
+ unsigned int cpu;
+
+ cpumask_and(&amask, &imsic->lmask, mask_val);
+
+ if (force)
+ cpu = cpumask_first(&amask);
+ else
+ cpu = cpumask_any_and(&amask, cpu_online_mask);
+
+ if (cpu >= nr_cpu_ids)
+ return -EINVAL;
+
+ if (out_target_cpu)
+ *out_target_cpu = cpu;
+
+ return 0;
+}
+
+static void imsic_id_set_target(unsigned int id, unsigned int target_cpu)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+ imsic->ids_target_cpu[id] = target_cpu;
+ raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+}
+
+static unsigned int imsic_id_get_target(unsigned int id)
+{
+ unsigned int ret;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+ ret = imsic->ids_target_cpu[id];
+ raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+
+ return ret;
+}
+
+static void __imsic_eix_update(unsigned long base_id,
+ unsigned long num_id, bool pend, bool val)
+{
+ unsigned long i, isel, ireg;
+ unsigned long id = base_id, last_id = base_id + num_id;
+
+ while (id < last_id) {
+ isel = id / BITS_PER_LONG;
+ isel *= BITS_PER_LONG / IMSIC_EIPx_BITS;
+ isel += (pend) ? IMSIC_EIP0 : IMSIC_EIE0;
+
+ ireg = 0;
+ for (i = id & (__riscv_xlen - 1);
+ (id < last_id) && (i < __riscv_xlen); i++) {
+ ireg |= BIT(i);
+ id++;
+ }
+
+ /*
+ * The IMSIC EIEx and EIPx registers are indirectly
+ * accessed using the ISELECT and IREG CSRs so we
+ * need to access these CSRs without getting preempted.
+ *
+ * All existing users of this function call this
+ * function with local IRQs disabled so we don't
+ * need to do anything special here.
+ */
+ if (val)
+ imsic_csr_set(isel, ireg);
+ else
+ imsic_csr_clear(isel, ireg);
+ }
+}
+
+#define __imsic_id_enable(__id) \
+ __imsic_eix_update((__id), 1, false, true)
+#define __imsic_id_disable(__id) \
+ __imsic_eix_update((__id), 1, false, false)
+
+static void imsic_ids_local_sync(void)
+{
+ int i;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+ for (i = 1; i <= imsic->global.nr_ids; i++) {
+ if (imsic->ipi_id == i)
+ continue;
+
+ if (test_bit(i, imsic->ids_enabled_bimap))
+ __imsic_id_enable(i);
+ else
+ __imsic_id_disable(i);
+ }
+ raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+}
+
+static void imsic_ids_local_delivery(bool enable)
+{
+ if (enable) {
+ imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_ENABLE_EITHRESHOLD);
+ imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_ENABLE_EIDELIVERY);
+ } else {
+ imsic_csr_write(IMSIC_EIDELIVERY, IMSIC_DISABLE_EIDELIVERY);
+ imsic_csr_write(IMSIC_EITHRESHOLD, IMSIC_DISABLE_EITHRESHOLD);
+ }
+}
+
+#ifdef CONFIG_SMP
+static irqreturn_t imsic_ids_sync_handler(int irq, void *data)
+{
+ imsic_ids_local_sync();
+ return IRQ_HANDLED;
+}
+
+static void imsic_ids_remote_sync(void)
+{
+ struct cpumask amask;
+
+ /*
+ * Inject the ID synchronization IPI to all target CPUs except
+ * the current CPU. The ipi_send_mask() implementation of the
+ * IPI mux only injects the IPI for CPUs that have enabled it,
+ * so offline CPUs won't receive it. An offline CPU
+ * unconditionally synchronizes its IDs through
+ * imsic_starting_cpu() when it is brought up.
+ */
+ cpumask_andnot(&amask, &imsic->lmask, cpumask_of(smp_processor_id()));
+ __ipi_send_mask(imsic->ipi_lsync_desc, &amask);
+}
+#else
+#define imsic_ids_remote_sync()
+#endif
+
+static int imsic_ids_alloc(unsigned int order)
+{
+ int ret;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+ ret = bitmap_find_free_region(imsic->ids_used_bimap,
+ imsic->global.nr_ids + 1, order);
+ raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+
+ return ret;
+}
+
+static void imsic_ids_free(unsigned int base_id, unsigned int order)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+ bitmap_release_region(imsic->ids_used_bimap, base_id, order);
+ raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+}
+
+static int __init imsic_ids_init(void)
+{
+ int i;
+ struct imsic_global_config *global = &imsic->global;
+
+ raw_spin_lock_init(&imsic->ids_lock);
+
+ /* Allocate used bitmap */
+ imsic->ids_used_bimap = bitmap_zalloc(global->nr_ids + 1, GFP_KERNEL);
+ if (!imsic->ids_used_bimap)
+ return -ENOMEM;
+
+ /* Allocate enabled bitmap */
+ imsic->ids_enabled_bimap = bitmap_zalloc(global->nr_ids + 1,
+ GFP_KERNEL);
+ if (!imsic->ids_enabled_bimap) {
+ bitmap_free(imsic->ids_used_bimap);
+ return -ENOMEM;
+ }
+
+ /* Allocate target CPU array */
+ imsic->ids_target_cpu = kcalloc(global->nr_ids + 1,
+ sizeof(unsigned int), GFP_KERNEL);
+ if (!imsic->ids_target_cpu) {
+ bitmap_free(imsic->ids_enabled_bimap);
+ bitmap_free(imsic->ids_used_bimap);
+ return -ENOMEM;
+ }
+ for (i = 0; i <= global->nr_ids; i++)
+ imsic->ids_target_cpu[i] = UINT_MAX;
+
+ /* Reserve ID#0 because it is special and never implemented */
+ bitmap_set(imsic->ids_used_bimap, 0, 1);
+
+ return 0;
+}
+
+static void __init imsic_ids_cleanup(void)
+{
+ kfree(imsic->ids_target_cpu);
+ bitmap_free(imsic->ids_enabled_bimap);
+ bitmap_free(imsic->ids_used_bimap);
+}
+
+#ifdef CONFIG_SMP
+static void imsic_ipi_send(unsigned int cpu)
+{
+ struct imsic_local_config *local =
+ per_cpu_ptr(imsic->global.local, cpu);
+
+ writel(imsic->ipi_id, local->msi_va);
+}
+
+static void imsic_ipi_starting_cpu(void)
+{
+ /* Enable IPIs for current CPU. */
+ __imsic_id_enable(imsic->ipi_id);
+
+ /* Enable virtual IPI used for IMSIC ID synchronization */
+ enable_percpu_irq(imsic->ipi_virq, 0);
+}
+
+static void imsic_ipi_dying_cpu(void)
+{
+ /*
+ * Disable virtual IPI used for IMSIC ID synchronization so
+ * that we don't receive ID synchronization requests.
+ */
+ disable_percpu_irq(imsic->ipi_virq);
+}
+
+static int __init imsic_ipi_domain_init(void)
+{
+ int virq;
+
+ /* Allocate interrupt identity for IPIs */
+ virq = imsic_ids_alloc(get_count_order(1));
+ if (virq < 0)
+ return virq;
+ imsic->ipi_id = virq;
+
+ /* Create IMSIC IPI multiplexing */
+ virq = ipi_mux_create(IMSIC_NR_IPI, imsic_ipi_send);
+ if (virq <= 0) {
+ imsic_ids_free(imsic->ipi_id, get_count_order(1));
+ return (virq < 0) ? virq : -ENOMEM;
+ }
+ imsic->ipi_virq = virq;
+
+ /* First vIRQ is used for IMSIC ID synchronization */
+ virq = request_percpu_irq(imsic->ipi_virq, imsic_ids_sync_handler,
+ "riscv-imsic-lsync", imsic->global.local);
+ if (virq) {
+ imsic_ids_free(imsic->ipi_id, get_count_order(1));
+ return virq;
+ }
+ irq_set_status_flags(imsic->ipi_virq, IRQ_HIDDEN);
+ imsic->ipi_lsync_desc = irq_to_desc(imsic->ipi_virq);
+
+ /* Set vIRQ range */
+ riscv_ipi_set_virq_range(imsic->ipi_virq + 1, IMSIC_NR_IPI - 1, true);
+
+ return 0;
+}
+
+static void __init imsic_ipi_domain_cleanup(void)
+{
+ if (imsic->ipi_lsync_desc)
+ free_percpu_irq(imsic->ipi_virq, imsic->global.local);
+ imsic_ids_free(imsic->ipi_id, get_count_order(1));
+}
+#else
+static void imsic_ipi_starting_cpu(void)
+{
+}
+
+static void imsic_ipi_dying_cpu(void)
+{
+}
+
+static int __init imsic_ipi_domain_init(void)
+{
+ /* Clear the IPI id because we are not using IPIs */
+ imsic->ipi_id = 0;
+ return 0;
+}
+
+static void __init imsic_ipi_domain_cleanup(void)
+{
+}
+#endif
+
+static void imsic_irq_mask(struct irq_data *d)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+ bitmap_clear(imsic->ids_enabled_bimap, d->hwirq, 1);
+ __imsic_id_disable(d->hwirq);
+ raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+
+ imsic_ids_remote_sync();
+}
+
+static void imsic_irq_unmask(struct irq_data *d)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&imsic->ids_lock, flags);
+ bitmap_set(imsic->ids_enabled_bimap, d->hwirq, 1);
+ __imsic_id_enable(d->hwirq);
+ raw_spin_unlock_irqrestore(&imsic->ids_lock, flags);
+
+ imsic_ids_remote_sync();
+}
+
+static void imsic_irq_compose_msi_msg(struct irq_data *d,
+ struct msi_msg *msg)
+{
+ struct msi_desc *desc = irq_data_get_msi_desc(d);
+ phys_addr_t msi_addr;
+ unsigned int cpu;
+ int err;
+
+ cpu = imsic_id_get_target(d->hwirq);
+ if (WARN_ON(cpu == UINT_MAX))
+ return;
+
+ err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+ if (WARN_ON(err))
+ return;
+
+ msg->address_hi = upper_32_bits(msi_addr);
+ msg->address_lo = lower_32_bits(msi_addr);
+ msg->data = d->hwirq;
+ iommu_dma_compose_msi_msg(desc, msg);
+}
+
+#ifdef CONFIG_SMP
+static int imsic_irq_set_affinity(struct irq_data *d,
+ const struct cpumask *mask_val,
+ bool force)
+{
+ unsigned int target_cpu;
+ int rc;
+
+ rc = imsic_get_cpu(mask_val, force, &target_cpu);
+ if (rc)
+ return rc;
+
+ imsic_id_set_target(d->hwirq, target_cpu);
+ irq_data_update_effective_affinity(d, cpumask_of(target_cpu));
+
+ return IRQ_SET_MASK_OK;
+}
+#endif
+
+static struct irq_chip imsic_irq_base_chip = {
+ .name = "RISC-V IMSIC-BASE",
+ .irq_mask = imsic_irq_mask,
+ .irq_unmask = imsic_irq_unmask,
+#ifdef CONFIG_SMP
+ .irq_set_affinity = imsic_irq_set_affinity,
+#endif
+ .irq_compose_msi_msg = imsic_irq_compose_msi_msg,
+ .flags = IRQCHIP_SKIP_SET_WAKE |
+ IRQCHIP_MASK_ON_SUSPEND,
+};
+
+static int imsic_irq_domain_alloc(struct irq_domain *domain,
+ unsigned int virq,
+ unsigned int nr_irqs,
+ void *args)
+{
+ msi_alloc_info_t *info = args;
+ phys_addr_t msi_addr;
+ int i, hwirq, err = 0;
+ unsigned int cpu;
+
+ err = imsic_get_cpu(&imsic->lmask, false, &cpu);
+ if (err)
+ return err;
+
+ err = imsic_cpu_page_phys(cpu, 0, &msi_addr);
+ if (err)
+ return err;
+
+ hwirq = imsic_ids_alloc(get_count_order(nr_irqs));
+ if (hwirq < 0)
+ return hwirq;
+
+ err = iommu_dma_prepare_msi(info->desc, msi_addr);
+ if (err)
+ goto fail;
+
+ for (i = 0; i < nr_irqs; i++) {
+ imsic_id_set_target(hwirq + i, cpu);
+ irq_domain_set_info(domain, virq + i, hwirq + i,
+ &imsic_irq_base_chip, imsic,
+ handle_simple_irq, NULL, NULL);
+ irq_set_noprobe(virq + i);
+ irq_set_affinity(virq + i, &imsic->lmask);
+ /*
+ * IMSIC does not implement irq_disable() so Linux interrupt
+ * subsystem will take a lazy approach for disabling an IMSIC
+ * interrupt. This means IMSIC interrupts are left unmasked
+ * upon system suspend and interrupts are not processed
+ * immediately upon system wake up. To tackle this, we disable
+ * the lazy approach for all IMSIC interrupts.
+ */
+ irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
+ }
+
+ return 0;
+
+fail:
+ imsic_ids_free(hwirq, get_count_order(nr_irqs));
+ return err;
+}
+
+static void imsic_irq_domain_free(struct irq_domain *domain,
+ unsigned int virq,
+ unsigned int nr_irqs)
+{
+ struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+
+ imsic_ids_free(d->hwirq, get_count_order(nr_irqs));
+ irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static const struct irq_domain_ops imsic_base_domain_ops = {
+ .alloc = imsic_irq_domain_alloc,
+ .free = imsic_irq_domain_free,
+};
+
+static struct irq_chip imsic_plat_irq_chip = {
+ .name = "RISC-V IMSIC-PLAT",
+};
+
+static struct msi_domain_ops imsic_plat_domain_ops = {
+};
+
+static struct msi_domain_info imsic_plat_domain_info = {
+ .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
+ .ops = &imsic_plat_domain_ops,
+ .chip = &imsic_plat_irq_chip,
+};
+
+static int __init imsic_irq_domains_init(struct fwnode_handle *fwnode)
+{
+ /* Create Base IRQ domain */
+ imsic->base_domain = irq_domain_create_tree(fwnode,
+ &imsic_base_domain_ops, imsic);
+ if (!imsic->base_domain) {
+ pr_err("Failed to create IMSIC base domain\n");
+ return -ENOMEM;
+ }
+ irq_domain_update_bus_token(imsic->base_domain, DOMAIN_BUS_NEXUS);
+
+ /* Create Platform MSI domain */
+ imsic->plat_domain = platform_msi_create_irq_domain(fwnode,
+ &imsic_plat_domain_info,
+ imsic->base_domain);
+ if (!imsic->plat_domain) {
+ pr_err("Failed to create IMSIC platform domain\n");
+ irq_domain_remove(imsic->base_domain);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+/*
+ * To handle an interrupt, we read the TOPEI CSR and write zero in one
+ * instruction. If TOPEI CSR is non-zero then we translate TOPEI.ID to
+ * Linux interrupt number and let Linux IRQ subsystem handle it.
+ */
+static void imsic_handle_irq(struct irq_desc *desc)
+{
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ irq_hw_number_t hwirq;
+ int err;
+
+ chained_irq_enter(chip, desc);
+
+ while ((hwirq = csr_swap(CSR_TOPEI, 0))) {
+ hwirq = hwirq >> TOPEI_ID_SHIFT;
+
+ if (hwirq == imsic->ipi_id) {
+#ifdef CONFIG_SMP
+ ipi_mux_process();
+#endif
+ continue;
+ }
+
+ err = generic_handle_domain_irq(imsic->base_domain, hwirq);
+ if (unlikely(err))
+ pr_warn_ratelimited(
+ "hwirq %lu mapping not found\n", hwirq);
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static int imsic_starting_cpu(unsigned int cpu)
+{
+ /* Enable per-CPU parent interrupt */
+ enable_percpu_irq(imsic_parent_irq,
+ irq_get_trigger_type(imsic_parent_irq));
+
+ /* Setup IPIs */
+ imsic_ipi_starting_cpu();
+
+ /*
+ * Interrupt identities might have been enabled/disabled while
+ * this CPU was not running, so sync up the local enable/disable state.
+ */
+ imsic_ids_local_sync();
+
+ /* Enable local interrupt delivery */
+ imsic_ids_local_delivery(true);
+
+ return 0;
+}
+
+static int imsic_dying_cpu(unsigned int cpu)
+{
+ /* Cleanup IPIs */
+ imsic_ipi_dying_cpu();
+
+ return 0;
+}
+
+static int __init imsic_get_parent_hartid(struct fwnode_handle *fwnode,
+ u32 index, unsigned long *hartid)
+{
+ int rc;
+ struct fwnode_reference_args parent;
+
+ rc = fwnode_property_get_reference_args(fwnode,
+ "interrupts-extended", "#interrupt-cells",
+ 0, index, &parent);
+ if (rc)
+ return rc;
+
+ /*
+ * Skip interrupts other than external interrupts for
+ * current privilege level.
+ */
+ if (parent.args[0] != RV_IRQ_EXT)
+ return -EINVAL;
+
+ return riscv_fw_parent_hartid(parent.fwnode, hartid);
+}
+
+static int __init imsic_get_mmio_resource(struct fwnode_handle *fwnode,
+ u32 index, struct resource *res)
+{
+ /*
+ * Currently, only OF fwnodes are supported. Extend this function
+ * to handle other fwnode types when adding ACPI support.
+ */
+ if (!is_of_node(fwnode))
+ return -EINVAL;
+ return of_address_to_resource(to_of_node(fwnode), index, res);
+}
+
+static int __init imsic_init(struct fwnode_handle *fwnode)
+{
+ int rc, cpu;
+ phys_addr_t base_addr;
+ struct irq_domain *domain;
+ void __iomem **mmios_va = NULL;
+ struct resource res, *mmios = NULL;
+ struct imsic_local_config *local;
+ struct imsic_global_config *global;
+ unsigned long reloff, hartid;
+ u32 i, j, index, nr_parent_irqs, nr_handlers = 0, num_mmios = 0;
+
+ /*
+ * Only one IMSIC instance allowed in a platform for clean
+ * implementation of SMP IRQ affinity and per-CPU IPIs.
+ *
+ * This means on a multi-socket (or multi-die) platform we
+ * will have multiple MMIO regions for one IMSIC instance.
+ */
+ if (imsic) {
+ pr_err("%pfwP: already initialized hence ignoring\n",
+ fwnode);
+ return -ENODEV;
+ }
+
+ if (!riscv_isa_extension_available(NULL, SxAIA)) {
+ pr_err("%pfwP: AIA support not available\n", fwnode);
+ return -ENODEV;
+ }
+
+ imsic = kzalloc(sizeof(*imsic), GFP_KERNEL);
+ if (!imsic)
+ return -ENOMEM;
+ global = &imsic->global;
+
+ global->local = alloc_percpu(typeof(*(global->local)));
+ if (!global->local) {
+ rc = -ENOMEM;
+ goto out_free_priv;
+ }
+
+ /* Find number of parent interrupts */
+ nr_parent_irqs = 0;
+ while (!imsic_get_parent_hartid(fwnode, nr_parent_irqs, &hartid))
+ nr_parent_irqs++;
+ if (!nr_parent_irqs) {
+ pr_err("%pfwP: no parent irqs available\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_local;
+ }
+
+ /* Find number of guest index bits in MSI address */
+ rc = fwnode_property_read_u32_array(fwnode, "riscv,guest-index-bits",
+ &global->guest_index_bits, 1);
+ if (rc)
+ global->guest_index_bits = 0;
+ i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
+ if (i < global->guest_index_bits) {
+ pr_err("%pfwP: guest index bits too big\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_local;
+ }
+
+ /* Find number of HART index bits */
+ rc = fwnode_property_read_u32_array(fwnode, "riscv,hart-index-bits",
+ &global->hart_index_bits, 1);
+ if (rc) {
+ /* Assume default value */
+ global->hart_index_bits = __fls(nr_parent_irqs);
+ if (BIT(global->hart_index_bits) < nr_parent_irqs)
+ global->hart_index_bits++;
+ }
+ i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT - global->guest_index_bits;
+ if (i < global->hart_index_bits) {
+ pr_err("%pfwP: HART index bits too big\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_local;
+ }
+
+ /* Find number of group index bits */
+ rc = fwnode_property_read_u32_array(fwnode, "riscv,group-index-bits",
+ &global->group_index_bits, 1);
+ if (rc)
+ global->group_index_bits = 0;
+ i = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT -
+ global->guest_index_bits - global->hart_index_bits;
+ if (i < global->group_index_bits) {
+ pr_err("%pfwP: group index bits too big\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_local;
+ }
+
+ /*
+ * Find first bit position of group index.
+ * If not specified, assume the default APLIC-IMSIC configuration.
+ */
+ rc = fwnode_property_read_u32_array(fwnode, "riscv,group-index-shift",
+ &global->group_index_shift, 1);
+ if (rc)
+ global->group_index_shift = IMSIC_MMIO_PAGE_SHIFT * 2;
+ i = global->group_index_bits + global->group_index_shift - 1;
+ if (i >= BITS_PER_LONG) {
+ pr_err("%pfwP: group index shift too big\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_local;
+ }
+
+ /* Find number of interrupt identities */
+ rc = fwnode_property_read_u32_array(fwnode, "riscv,num-ids",
+ &global->nr_ids, 1);
+ if (rc) {
+ pr_err("%pfwP: number of interrupt identities not found\n",
+ fwnode);
+ goto out_free_local;
+ }
+ if ((global->nr_ids < IMSIC_MIN_ID) ||
+ (global->nr_ids >= IMSIC_MAX_ID) ||
+ ((global->nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+ pr_err("%pfwP: invalid number of interrupt identities\n",
+ fwnode);
+ rc = -EINVAL;
+ goto out_free_local;
+ }
+
+ /* Find number of guest interrupt identities */
+ if (fwnode_property_read_u32_array(fwnode, "riscv,num-guest-ids",
+ &global->nr_guest_ids, 1))
+ global->nr_guest_ids = global->nr_ids;
+ if ((global->nr_guest_ids < IMSIC_MIN_ID) ||
+ (global->nr_guest_ids >= IMSIC_MAX_ID) ||
+ ((global->nr_guest_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID)) {
+ pr_err("%pfwP: invalid number of guest interrupt identities\n",
+ fwnode);
+ rc = -EINVAL;
+ goto out_free_local;
+ }
+
+ /* Compute base address */
+ rc = imsic_get_mmio_resource(fwnode, 0, &res);
+ if (rc) {
+ pr_err("%pfwP: first MMIO resource not found\n", fwnode);
+ rc = -EINVAL;
+ goto out_free_local;
+ }
+ global->base_addr = res.start;
+ global->base_addr &= ~(BIT(global->guest_index_bits +
+ global->hart_index_bits +
+ IMSIC_MMIO_PAGE_SHIFT) - 1);
+ global->base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+ global->group_index_shift);
+
+ /* Find number of MMIO register sets */
+ while (!imsic_get_mmio_resource(fwnode, num_mmios, &res))
+ num_mmios++;
+
+ /* Allocate MMIO resource array */
+ mmios = kcalloc(num_mmios, sizeof(*mmios), GFP_KERNEL);
+ if (!mmios) {
+ rc = -ENOMEM;
+ goto out_free_local;
+ }
+
+ /* Allocate MMIO virtual address array */
+ mmios_va = kcalloc(num_mmios, sizeof(*mmios_va), GFP_KERNEL);
+ if (!mmios_va) {
+ rc = -ENOMEM;
+ kfree(mmios);
+ goto out_free_local;
+ }
+
+ /* Parse and map MMIO register sets */
+ for (i = 0; i < num_mmios; i++) {
+ rc = imsic_get_mmio_resource(fwnode, i, &mmios[i]);
+ if (rc) {
+ pr_err("%pfwP: unable to parse MMIO regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+
+ base_addr = mmios[i].start;
+ base_addr &= ~(BIT(global->guest_index_bits +
+ global->hart_index_bits +
+ IMSIC_MMIO_PAGE_SHIFT) - 1);
+ base_addr &= ~((BIT(global->group_index_bits) - 1) <<
+ global->group_index_shift);
+ if (base_addr != global->base_addr) {
+ rc = -EINVAL;
+ pr_err("%pfwP: address mismatch for regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+
+ mmios_va[i] = ioremap(mmios[i].start, resource_size(&mmios[i]));
+ if (!mmios_va[i]) {
+ rc = -EIO;
+ pr_err("%pfwP: unable to map MMIO regset %d\n",
+ fwnode, i);
+ goto out_iounmap;
+ }
+ }
+
+ /* Initialize interrupt identity management */
+ rc = imsic_ids_init();
+ if (rc) {
+ pr_err("%pfwP: failed to initialize interrupt management\n",
+ fwnode);
+ goto out_iounmap;
+ }
+
+ /* Configure handlers for target CPUs */
+ for (i = 0; i < nr_parent_irqs; i++) {
+ rc = imsic_get_parent_hartid(fwnode, i, &hartid);
+ if (rc) {
+ pr_warn("%pfwP: hart ID for parent irq%d not found\n",
+ fwnode, i);
+ continue;
+ }
+
+ cpu = riscv_hartid_to_cpuid(hartid);
+ if (cpu < 0) {
+ pr_warn("%pfwP: invalid cpuid for parent irq%d\n",
+ fwnode, i);
+ continue;
+ }
+
+ /* Find MMIO location of MSI page */
+ index = num_mmios;
+ reloff = i * BIT(global->guest_index_bits) *
+ IMSIC_MMIO_PAGE_SZ;
+ for (j = 0; j < num_mmios; j++) {
+ if (reloff < resource_size(&mmios[j])) {
+ index = j;
+ break;
+ }
+
+ /*
+ * MMIO region size may not be aligned to
+ * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
+ * if holes are present.
+ */
+ reloff -= ALIGN(resource_size(&mmios[j]),
+ BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ);
+ }
+ if (index >= num_mmios) {
+ pr_warn("%pfwP: MMIO not found for parent irq%d\n",
+ fwnode, i);
+ continue;
+ }
+
+ cpumask_set_cpu(cpu, &imsic->lmask);
+
+ local = per_cpu_ptr(global->local, cpu);
+ local->msi_pa = mmios[index].start + reloff;
+ local->msi_va = mmios_va[index] + reloff;
+
+ nr_handlers++;
+ }
+
+ /* If no CPU handlers found then can't take interrupts */
+ if (!nr_handlers) {
+ pr_err("%pfwP: No CPU handlers found\n", fwnode);
+ rc = -ENODEV;
+ goto out_ids_cleanup;
+ }
+
+ /* Find parent domain and register chained handler */
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+ DOMAIN_BUS_ANY);
+ if (!domain) {
+ pr_err("%pfwP: Failed to find INTC domain\n", fwnode);
+ rc = -ENOENT;
+ goto out_ids_cleanup;
+ }
+ imsic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (!imsic_parent_irq) {
+ pr_err("%pfwP: Failed to create INTC mapping\n", fwnode);
+ rc = -ENOENT;
+ goto out_ids_cleanup;
+ }
+ irq_set_chained_handler(imsic_parent_irq, imsic_handle_irq);
+
+ /* Initialize IPI domain */
+ rc = imsic_ipi_domain_init();
+ if (rc) {
+ pr_err("%pfwP: Failed to initialize IPI domain\n", fwnode);
+ goto out_ids_cleanup;
+ }
+
+ /* Initialize IRQ and MSI domains */
+ rc = imsic_irq_domains_init(fwnode);
+ if (rc) {
+ pr_err("%pfwP: Failed to initialize IRQ and MSI domains\n",
+ fwnode);
+ goto out_ipi_domain_cleanup;
+ }
+
+ /*
+ * Setup cpuhp state (must be done after setting imsic_parent_irq)
+ *
+ * Don't disable per-CPU IMSIC file when CPU goes offline
+ * because this affects IPI and the masking/unmasking of
+ * virtual IPIs is done via generic IPI-Mux
+ */
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "irqchip/riscv/imsic:starting",
+ imsic_starting_cpu, imsic_dying_cpu);
+
+ /* The MMIO arrays are no longer needed, so free them */
+ kfree(mmios_va);
+ kfree(mmios);
+
+ pr_info("%pfwP: hart-index-bits: %d, guest-index-bits: %d\n",
+ fwnode, global->hart_index_bits, global->guest_index_bits);
+ pr_info("%pfwP: group-index-bits: %d, group-index-shift: %d\n",
+ fwnode, global->group_index_bits, global->group_index_shift);
+ pr_info("%pfwP: mapped %d interrupts for %d CPUs at %pa\n",
+ fwnode, global->nr_ids, nr_handlers, &global->base_addr);
+ if (imsic->ipi_id)
+ pr_info("%pfwP: providing IPIs using interrupt %d\n",
+ fwnode, imsic->ipi_id);
+
+ return 0;
+
+out_ipi_domain_cleanup:
+ imsic_ipi_domain_cleanup();
+out_ids_cleanup:
+ imsic_ids_cleanup();
+out_iounmap:
+ for (i = 0; i < num_mmios; i++) {
+ if (mmios_va[i])
+ iounmap(mmios_va[i]);
+ }
+ kfree(mmios_va);
+ kfree(mmios);
+out_free_local:
+ free_percpu(imsic->global.local);
+out_free_priv:
+ kfree(imsic);
+ imsic = NULL;
+ return rc;
+}
+
+static int __init imsic_dt_init(struct device_node *node,
+ struct device_node *parent)
+{
+ return imsic_init(&node->fwnode);
+}
+IRQCHIP_DECLARE(riscv_imsic, "riscv,imsics", imsic_dt_init);
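For readers following the driver above, here is a minimal userspace sketch of two pieces of arithmetic it relies on: how __imsic_eix_update() maps an interrupt identity to an indirect EIE/EIP register and bit, and the range check imsic_init() applies to "riscv,num-ids". The constants are copied from riscv-imsic.h in this patch. The names eix_loc, eix_locate() and nr_ids_valid() are hypothetical and do not exist in the driver, and the xlen parameter stands in for BITS_PER_LONG. This is an illustration only, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from riscv-imsic.h in this patch. */
#define IMSIC_MIN_ID    63u
#define IMSIC_MAX_ID    2048u
#define IMSIC_EIP0      0x80u
#define IMSIC_EIE0      0xc0u
#define IMSIC_EIPx_BITS 32u

/* Hypothetical helper type: where one interrupt identity lives. */
struct eix_loc {
	unsigned int isel; /* indirect register number to select */
	unsigned int bit;  /* bit position inside that XLEN-wide register */
};

/*
 * Mirror of the isel arithmetic in __imsic_eix_update().
 * xlen stands in for BITS_PER_LONG (32 or 64). Register numbers are
 * counted in 32-bit units, so on a 64-bit hart each XLEN-wide access
 * covers two register numbers and only even offsets are produced.
 */
static struct eix_loc eix_locate(unsigned int id, unsigned int xlen, bool pend)
{
	struct eix_loc loc;

	assert(xlen == 32 || xlen == 64);
	loc.isel = (id / xlen) * (xlen / IMSIC_EIPx_BITS);
	loc.isel += pend ? IMSIC_EIP0 : IMSIC_EIE0;
	loc.bit = id % xlen;
	return loc;
}

/* Same range check that imsic_init() applies to "riscv,num-ids". */
static bool nr_ids_valid(unsigned int nr_ids)
{
	return nr_ids >= IMSIC_MIN_ID && nr_ids < IMSIC_MAX_ID &&
	       (nr_ids & IMSIC_MIN_ID) == IMSIC_MIN_ID;
}
```

With these definitions, identity 65 lands in register 0xc2, bit 1 for both xlen values when looking at the enable bits, while identity 33 lands in 0x80, bit 33 on a 64-bit hart and 0x81, bit 1 on a 32-bit hart when looking at the pending bits.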
diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
new file mode 100644
index 000000000000..1f6fc9a57218
--- /dev/null
+++ b/include/linux/irqchip/riscv-imsic.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+#ifndef __LINUX_IRQCHIP_RISCV_IMSIC_H
+#define __LINUX_IRQCHIP_RISCV_IMSIC_H
+
+#include <linux/types.h>
+#include <asm/csr.h>
+
+#define IMSIC_MMIO_PAGE_SHIFT 12
+#define IMSIC_MMIO_PAGE_SZ (1UL << IMSIC_MMIO_PAGE_SHIFT)
+#define IMSIC_MMIO_PAGE_LE 0x00
+#define IMSIC_MMIO_PAGE_BE 0x04
+
+#define IMSIC_MIN_ID 63
+#define IMSIC_MAX_ID 2048
+
+#define IMSIC_EIDELIVERY 0x70
+
+#define IMSIC_EITHRESHOLD 0x72
+
+#define IMSIC_EIP0 0x80
+#define IMSIC_EIP63 0xbf
+#define IMSIC_EIPx_BITS 32
+
+#define IMSIC_EIE0 0xc0
+#define IMSIC_EIE63 0xff
+#define IMSIC_EIEx_BITS 32
+
+#define IMSIC_FIRST IMSIC_EIDELIVERY
+#define IMSIC_LAST IMSIC_EIE63
+
+#define IMSIC_MMIO_SETIPNUM_LE 0x00
+#define IMSIC_MMIO_SETIPNUM_BE 0x04
+
+struct imsic_local_config {
+ phys_addr_t msi_pa;
+ void __iomem *msi_va;
+};
+
+struct imsic_global_config {
+ /*
+ * MSI Target Address Scheme
+ *
+ * XLEN-1 12 0
+ * | | |
+ * -------------------------------------------------------------
+ * |xxxxxx|Group Index|xxxxxxxxxxx|HART Index|Guest Index| 0 |
+ * -------------------------------------------------------------
+ */
+
+ /* Bits representing Guest index, HART index, and Group index */
+ u32 guest_index_bits;
+ u32 hart_index_bits;
+ u32 group_index_bits;
+ u32 group_index_shift;
+
+ /* Global base address matching all target MSI addresses */
+ phys_addr_t base_addr;
+
+ /* Number of interrupt identities */
+ u32 nr_ids;
+
+ /* Number of guest interrupt identities */
+ u32 nr_guest_ids;
+
+ /* Per-CPU IMSIC addresses */
+ struct imsic_local_config __percpu *local;
+};
+
+#ifdef CONFIG_RISCV_IMSIC
+
+extern const struct imsic_global_config *imsic_get_global_config(void);
+
+#else
+
+static inline const struct imsic_global_config *imsic_get_global_config(void)
+{
+ return NULL;
+}
+
+#endif
+
+#endif
--
2.34.1
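
The "MSI Target Address Scheme" comment in the header above, together with the base address computation in imsic_init(), can be sketched in userspace as follows. The struct msi_layout type and the functions imsic_base_of() and default_hart_index_bits() are hypothetical names introduced for illustration, and the addresses in the usage note are made-up example values. The masking follows the two "base_addr &= ..." statements in the patch, and the default mirrors the __fls() based fallback used when "riscv,hart-index-bits" is absent.

```c
#include <assert.h>
#include <stdint.h>

#define IMSIC_MMIO_PAGE_SHIFT 12u

/* Hypothetical container for the four layout parameters. */
struct msi_layout {
	unsigned int guest_index_bits;
	unsigned int hart_index_bits;
	unsigned int group_index_bits;
	unsigned int group_index_shift;
};

/*
 * Strip the page offset, guest index, HART index and group index
 * fields from an MSI page address, leaving the part that must be
 * identical for every MMIO register set of the one IMSIC instance.
 */
static uint64_t imsic_base_of(uint64_t addr, const struct msi_layout *l)
{
	unsigned int low = l->guest_index_bits + l->hart_index_bits +
			   IMSIC_MMIO_PAGE_SHIFT;

	assert(low < 64);
	assert(l->group_index_bits < 64 && l->group_index_shift < 64);

	addr &= ~((UINT64_C(1) << low) - 1);
	addr &= ~(((UINT64_C(1) << l->group_index_bits) - 1)
		  << l->group_index_shift);
	return addr;
}

/*
 * Smallest number of bits able to index nr_parent_irqs HARTs, which
 * is what the __fls() based fallback in imsic_init() computes.
 */
static unsigned int default_hart_index_bits(unsigned int nr_parent_irqs)
{
	unsigned int bits = 0;

	assert(nr_parent_irqs > 0 && nr_parent_irqs <= (1u << 31));
	while ((1u << bits) < nr_parent_irqs)
		bits++;
	return bits;
}
```

For example, with two HART index bits, no guest index bits, one group index bit and a group index shift of 24, both 0x28003000 and 0x29001000 reduce to the same base 0x28000000, which is the property the "address mismatch for regset" check in imsic_init() enforces.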


2023-06-13 16:06:50

by Anup Patel

Subject: [PATCH v4 10/10] MAINTAINERS: Add entry for RISC-V AIA drivers

Add myself as maintainer for RISC-V AIA drivers including the
RISC-V INTC driver which supports both AIA and non-AIA platforms.

Signed-off-by: Anup Patel <[email protected]>
---
MAINTAINERS | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 51da90e60004..2d474eb902fa 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18136,6 +18136,18 @@ S: Maintained
F: drivers/mtd/nand/raw/r852.c
F: drivers/mtd/nand/raw/r852.h

+RISC-V AIA DRIVERS
+M: Anup Patel <[email protected]>
+L: [email protected]
+S: Maintained
+F: Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml
+F: Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml
+F: drivers/irqchip/irq-riscv-aplic.c
+F: drivers/irqchip/irq-riscv-imsic.c
+F: drivers/irqchip/irq-riscv-intc.c
+F: include/linux/irqchip/riscv-aplic.h
+F: include/linux/irqchip/riscv-imsic.h
+
RISC-V ARCHITECTURE
M: Paul Walmsley <[email protected]>
M: Palmer Dabbelt <[email protected]>
--
2.34.1


2023-06-14 15:27:42

by Jason Gunthorpe

Subject: Re: [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support

On Tue, Jun 13, 2023 at 09:04:11PM +0530, Anup Patel wrote:
> We have a separate RISC-V IMSIC MSI address for each CPU so changing
> MSI (or IRQ) affinity results in re-programming of MSI address in
> the PCIe (or platform) device.
>
> Currently, iommu_dma_prepare_msi() is called only once, at the
> time of IRQ allocation, so the IOMMU DMA domain will only have a
> mapping for one MSI page. This means iommu_dma_compose_msi_msg() called
> by imsic_irq_compose_msi_msg() will always use the same MSI page
> irrespective of the target CPU's MSI address. In other words, changing
> MSI (or IRQ) affinity for a device using an IOMMU DMA domain will not
> work.

You didn't answer my question from last time - there seems to be no
iommu driver here so why are you messing with iommu_dma_prepare_msi()?

This path is only for platforms that have IOMMU drivers that translate
the MSI window. You should add this code to link the interrupt
controller to the iommu driver when you introduce the iommu driver,
not in this series?

And, as I said before, I'd like to NOT see new users of
iommu_dma_prepare_msi() since it is a very problematic API.

This hacking of it here is not making it better :(

Jason

2023-06-14 17:12:10

by Anup Patel

Subject: Re: [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support

On Wed, Jun 14, 2023 at 8:16 PM Jason Gunthorpe <[email protected]> wrote:
>
> On Tue, Jun 13, 2023 at 09:04:11PM +0530, Anup Patel wrote:
> > We have a separate RISC-V IMSIC MSI address for each CPU so changing
> > MSI (or IRQ) affinity results in re-programming of MSI address in
> > the PCIe (or platform) device.
> >
> > Currently, iommu_dma_prepare_msi() is called only once, at the
> > time of IRQ allocation, so the IOMMU DMA domain will only have a
> > mapping for one MSI page. This means iommu_dma_compose_msi_msg() called
> > by imsic_irq_compose_msi_msg() will always use the same MSI page
> > irrespective of the target CPU's MSI address. In other words, changing
> > MSI (or IRQ) affinity for a device using an IOMMU DMA domain will not
> > work.
>
> You didn't answer my question from last time - there seems to be no
> iommu driver here so why are you messing with iommu_dma_prepare_msi()?
>
> This path is only for platforms that have IOMMU drivers that translate
> the MSI window. You should add this code to link the interrupt
> controller to the iommu driver when you introduce the iommu driver,
> not in this series?
>
> And, as I said before, I'd like to NOT see new users of
> iommu_dma_prepare_msi() since it is a very problematic API.
>
> This hacking of it here is not making it better :(

I misunderstood your previous comments.

We can certainly deal with this later when the IOMMU
driver is available for RISC-V. I will drop this patch in the
next revision.

Regards,
Anup

2023-06-14 18:02:20

by Jason Gunthorpe

Subject: Re: [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support

On Wed, Jun 14, 2023 at 09:47:53PM +0530, Anup Patel wrote:
> On Wed, Jun 14, 2023 at 8:16 PM Jason Gunthorpe <[email protected]> wrote:
> >
> > On Tue, Jun 13, 2023 at 09:04:11PM +0530, Anup Patel wrote:
> > > We have a separate RISC-V IMSIC MSI address for each CPU so changing
> > > MSI (or IRQ) affinity results in re-programming of MSI address in
> > > the PCIe (or platform) device.
> > >
> > > Currently, iommu_dma_prepare_msi() is called only once, at the
> > > time of IRQ allocation, so the IOMMU DMA domain will only have a
> > > mapping for one MSI page. This means iommu_dma_compose_msi_msg() called
> > > by imsic_irq_compose_msi_msg() will always use the same MSI page
> > > irrespective of the target CPU's MSI address. In other words, changing
> > > MSI (or IRQ) affinity for a device using an IOMMU DMA domain will not
> > > work.
> >
> > You didn't answer my question from last time - there seems to be no
> > iommu driver here so why are you messing with iommu_dma_prepare_msi()?
> >
> > This path is only for platforms that have IOMMU drivers that translate
> > the MSI window. You should add this code to link the interrupt
> > controller to the iommu driver when you introduce the iommu driver,
> > not in this series?
> >
> > And, as I said before, I'd like to NOT see new users of
> > iommu_dma_prepare_msi() since it is a very problematic API.
> >
> > This hacking of it here is not making it better :(
>
> I misunderstood your previous comments.
>
> We can certainly deal with this later when the IOMMU
> driver is available for RISC-V. I will drop this patch in the
> next revision.

Not just this patch, but the calls to iommu_dma_prepare_msi() and
related APIs in the prior patch too. Assume the MSI window is directly
visible to DMA without translation.

When you come with an iommu driver we can discuss how best to proceed.

Thanks,
Jason
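
As a rough illustration of the untranslated case discussed above, where the MSI window is directly visible to DMA, composing the message reduces to picking the target CPU's MSI page physical address and using the interrupt identity as the data. This is a userspace sketch only: struct msi_msg_sketch is a hypothetical stand-in for the kernel's struct msi_msg, compose_msi() is not a driver function, and the msi_pa[] array stands in for the per-CPU msi_pa values that imsic_init() records.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct msi_msg. */
struct msi_msg_sketch {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/*
 * Build the MSI message for one interrupt identity aimed at one CPU.
 * msi_pa[] holds one MSI page physical address per CPU, so changing
 * the target CPU changes the address the device must be programmed
 * with, while the data (the interrupt identity) stays the same.
 */
static struct msi_msg_sketch compose_msi(const uint64_t *msi_pa,
					 unsigned int nr_cpus,
					 unsigned int cpu, uint32_t hwirq)
{
	struct msi_msg_sketch msg;

	assert(cpu < nr_cpus);
	msg.address_lo = (uint32_t)(msi_pa[cpu] & 0xffffffffu);
	msg.address_hi = (uint32_t)(msi_pa[cpu] >> 32);
	msg.data = hwirq;
	return msg;
}
```

Moving an interrupt from one CPU to another in this model means calling compose_msi() again with the new CPU index and rewriting the device's MSI address, which is the re-programming the quoted commit message refers to.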

2023-06-14 19:46:16

by Conor Dooley

Subject: Re: [PATCH v4 07/10] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC

Hey Anup,

Mostly looks good, one minor comment.

On Tue, Jun 13, 2023 at 09:04:12PM +0530, Anup Patel wrote:

> + riscv,children:
> + $ref: /schemas/types.yaml#/definitions/phandle-array
> + minItems: 1
> + maxItems: 1024
> + items:
> + maxItems: 1
> + description:
> + A list of child APLIC domains for the given APLIC domain. Each child
> + APLIC domain is assigned a child index in increasing order, with the
> + first child APLIC domain assigned child index 0. The APLIC domain child
> + index is used by firmware to delegate interrupts from the given APLIC
> + domain to a particular child APLIC domain.
> +
> + riscv,delegation:
> + $ref: /schemas/types.yaml#/definitions/phandle-array
> + minItems: 1
> + maxItems: 1024
> + items:
> + items:
> + - description: child APLIC domain phandle
> + - description: first interrupt number of the parent APLIC domain (inclusive)
> + - description: last interrupt number of the parent APLIC domain (inclusive)
> + description:
> + An interrupt delegation list where each entry is a triple consisting
> + of child APLIC domain phandle, first interrupt number of the parent
> + APLIC domain, and last interrupt number of the parent APLIC domain.
> + Firmware must configure interrupt delegation registers based on
> + interrupt delegation list.
> +
> +required:
> + - compatible
> + - reg
> + - interrupt-controller
> + - "#interrupt-cells"
> + - riscv,num-sources
> +
> +anyOf:
> + - required:
> + - interrupts-extended
> + - required:
> + - msi-parent

Not sure if you missed this from the last version, but I asked if we
needed a
dependencies:
riscv,delegate: [ riscv,children ]

IOW, I don't think it is valid to have a delegation without having
children?

Otherwise,
Reviewed-by: Conor Dooley <[email protected]>

Cheers,
Conor.



2023-06-15 05:51:59

by Anup Patel

Subject: Re: [PATCH v4 07/10] dt-bindings: interrupt-controller: Add RISC-V advanced PLIC

On Thu, Jun 15, 2023 at 12:57 AM Conor Dooley <[email protected]> wrote:
>
> Hey Anup,
>
> Mostly looks good, one minor comment.
>
> On Tue, Jun 13, 2023 at 09:04:12PM +0530, Anup Patel wrote:
>
> > + riscv,children:
> > + $ref: /schemas/types.yaml#/definitions/phandle-array
> > + minItems: 1
> > + maxItems: 1024
> > + items:
> > + maxItems: 1
> > + description:
> > + A list of child APLIC domains for the given APLIC domain. Each child
> > + APLIC domain is assigned a child index in increasing order, with the
> > + first child APLIC domain assigned child index 0. The APLIC domain child
> > + index is used by firmware to delegate interrupts from the given APLIC
> > + domain to a particular child APLIC domain.
> > +
> > + riscv,delegation:
> > + $ref: /schemas/types.yaml#/definitions/phandle-array
> > + minItems: 1
> > + maxItems: 1024
> > + items:
> > + items:
> > + - description: child APLIC domain phandle
> > + - description: first interrupt number of the parent APLIC domain (inclusive)
> > + - description: last interrupt number of the parent APLIC domain (inclusive)
> > + description:
> > + An interrupt delegation list where each entry is a triple consisting
> > + of child APLIC domain phandle, first interrupt number of the parent
> > + APLIC domain, and last interrupt number of the parent APLIC domain.
> > + Firmware must configure interrupt delegation registers based on
> > + interrupt delegation list.
> > +
> > +required:
> > + - compatible
> > + - reg
> > + - interrupt-controller
> > + - "#interrupt-cells"
> > + - riscv,num-sources
> > +
> > +anyOf:
> > + - required:
> > + - interrupts-extended
> > + - required:
> > + - msi-parent
>
> Not sure if you missed this from the last version, but I asked if we
> needed a
> dependencies:
> riscv,delegation: [ riscv,children ]
>
> IOW, I don't think it is valid to have a delegation without having
> children?

Ahh, yes. I missed this one. I will update in the next revision.

>
> Otherwise,
> Reviewed-by: Conor Dooley <[email protected]>
>
> Cheers,
> Conor.

Regards,
Anup

2023-06-15 05:59:46

by Anup Patel

Subject: Re: [PATCH v4 06/10] irqchip/riscv-imsic: Improve IOMMU DMA support

On Wed, Jun 14, 2023 at 10:20 PM Jason Gunthorpe <[email protected]> wrote:
>
> On Wed, Jun 14, 2023 at 09:47:53PM +0530, Anup Patel wrote:
> > On Wed, Jun 14, 2023 at 8:16 PM Jason Gunthorpe <[email protected]> wrote:
> > >
> > > On Tue, Jun 13, 2023 at 09:04:11PM +0530, Anup Patel wrote:
> > > > We have a separate RISC-V IMSIC MSI address for each CPU so changing
> > > > MSI (or IRQ) affinity results in re-programming of MSI address in
> > > > the PCIe (or platform) device.
> > > >
> > > > Currently, the iommu_dma_prepare_msi() is called only once at the
> > > > time of IRQ allocation so IOMMU DMA domain will only have mapping
> > > > for one MSI page. This means iommu_dma_compose_msi_msg() called
> > > > by imsic_irq_compose_msi_msg() will always use the same MSI page
> > > > irrespective of the target CPU MSI address. In other words, changing
> > > > MSI (or IRQ) affinity for device using IOMMU DMA domain will not
> > > > work.
> > >
> > > You didn't answer my question from last time - there seems to be no
> > > iommu driver here so why are you messing with iommu_dma_prepare_msi()?
> > >
> > > This path is only for platforms that have IOMMU drivers that translate
> > > the MSI window. You should add this code to link the interrupt
> > > controller to the iommu driver when you introduce the iommu driver,
> > > not in this series?
> > >
> > > And, as I said before, I'd like to NOT see new users of
> > > iommu_dma_prepare_msi() since it is a very problematic API.
> > >
> > > This hacking of it here is not making it better :(
> >
> > I misunderstood your previous comments.
> >
> > We can certainly deal with this later when the IOMMU
> > driver is available for RISC-V. I will drop this patch in the
> > next revision.
>
> Not only just this patch but the calls to iommu_dma_prepare_msi() and
> related APIs in the prior patch too. Assume the MSI window is directly
> visible to DMA without translation.

Okay, I will remove iommu_dma_xyz() usage from IMSIC driver in the
next revision.

>
> When you come with an iommu driver we can discuss how best to proceed.

Yes, that's better.

Regards,
Anup

2023-06-15 19:32:21

by Saravana Kannan

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <[email protected]> wrote:
>
> The RISC-V advanced interrupt architecture (AIA) specification defines
> a new interrupt controller for managing wired interrupts on a RISC-V
> platform. This new interrupt controller is referred to as advanced
> platform-level interrupt controller (APLIC) which can forward wired
> interrupts to CPUs (or HARTs) as local interrupts OR as message
> signaled interrupts.
> (For more details, refer to https://github.com/riscv/riscv-aia)
>
> This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> platforms.
>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> drivers/irqchip/Kconfig | 6 +
> drivers/irqchip/Makefile | 1 +
> drivers/irqchip/irq-riscv-aplic.c | 765 ++++++++++++++++++++++++++++
> include/linux/irqchip/riscv-aplic.h | 119 +++++
> 4 files changed, 891 insertions(+)
> create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> create mode 100644 include/linux/irqchip/riscv-aplic.h
>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index d700980372ef..834c0329f583 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> select IRQ_DOMAIN_HIERARCHY
> select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
>
> +config RISCV_APLIC
> + bool
> + depends on RISCV
> + select IRQ_DOMAIN_HIERARCHY
> + select GENERIC_MSI_IRQ
> +
> config RISCV_IMSIC
> bool
> depends on RISCV
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 577bde3e986b..438b8e1a152c 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> +obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic.o
> obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
> obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> new file mode 100644
> index 000000000000..1e710fdf5608
> --- /dev/null
> +++ b/drivers/irqchip/irq-riscv-aplic.c
> @@ -0,0 +1,765 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "riscv-aplic: " fmt
> +#include <linux/bitops.h>
> +#include <linux/cpu.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/chained_irq.h>
> +#include <linux/irqchip/riscv-aplic.h>
> +#include <linux/irqchip/riscv-imsic.h>
> +#include <linux/irqdomain.h>
> +#include <linux/module.h>
> +#include <linux/msi.h>
> +#include <linux/platform_device.h>
> +#include <linux/smp.h>
> +
> +#define APLIC_DEFAULT_PRIORITY 1
> +#define APLIC_DISABLE_IDELIVERY 0
> +#define APLIC_ENABLE_IDELIVERY 1
> +#define APLIC_DISABLE_ITHRESHOLD 1
> +#define APLIC_ENABLE_ITHRESHOLD 0
> +
> +struct aplic_msicfg {
> + phys_addr_t base_ppn;
> + u32 hhxs;
> + u32 hhxw;
> + u32 lhxs;
> + u32 lhxw;
> +};
> +
> +struct aplic_idc {
> + unsigned int hart_index;
> + void __iomem *regs;
> + struct aplic_priv *priv;
> +};
> +
> +struct aplic_priv {
> + struct fwnode_handle *fwnode;
> + u32 gsi_base;
> + u32 nr_irqs;
> + u32 nr_idcs;
> + void __iomem *regs;
> + struct irq_domain *irqdomain;
> + struct aplic_msicfg msicfg;
> + struct cpumask lmask;
> +};
> +
> +static unsigned int aplic_idc_parent_irq;
> +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> +
> +static void aplic_irq_unmask(struct irq_data *d)
> +{
> + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> + writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> +
> + if (!priv->nr_idcs)
> + irq_chip_unmask_parent(d);
> +}
> +
> +static void aplic_irq_mask(struct irq_data *d)
> +{
> + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> + writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> +
> + if (!priv->nr_idcs)
> + irq_chip_mask_parent(d);
> +}
> +
> +static int aplic_set_type(struct irq_data *d, unsigned int type)
> +{
> + u32 val = 0;
> + void __iomem *sourcecfg;
> + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> +
> + switch (type) {
> + case IRQ_TYPE_NONE:
> + val = APLIC_SOURCECFG_SM_INACTIVE;
> + break;
> + case IRQ_TYPE_LEVEL_LOW:
> + val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> + break;
> + case IRQ_TYPE_LEVEL_HIGH:
> + val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> + break;
> + case IRQ_TYPE_EDGE_FALLING:
> + val = APLIC_SOURCECFG_SM_EDGE_FALL;
> + break;
> + case IRQ_TYPE_EDGE_RISING:
> + val = APLIC_SOURCECFG_SM_EDGE_RISE;
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> + sourcecfg += (d->hwirq - 1) * sizeof(u32);
> + writel(val, sourcecfg);
> +
> + return 0;
> +}
> +
> +static void aplic_irq_eoi(struct irq_data *d)
> +{
> + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> + u32 reg_off, reg_mask;
> +
> + /*
> + * EOI handling is required only for level-triggered
> + * interrupts in APLIC MSI mode.
> + */
> +
> + if (priv->nr_idcs)
> + return;
> +
> + reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> + reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> + switch (irqd_get_trigger_type(d)) {
> + case IRQ_TYPE_LEVEL_LOW:
> + if (!(readl(priv->regs + reg_off) & reg_mask))
> + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> + break;
> + case IRQ_TYPE_LEVEL_HIGH:
> + if (readl(priv->regs + reg_off) & reg_mask)
> + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> + break;
> + }
> +}
> +
> +#ifdef CONFIG_SMP
> +static int aplic_set_affinity(struct irq_data *d,
> + const struct cpumask *mask_val, bool force)
> +{
> + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> + struct aplic_idc *idc;
> + unsigned int cpu, val;
> + struct cpumask amask;
> + void __iomem *target;
> +
> + if (!priv->nr_idcs)
> + return irq_chip_set_affinity_parent(d, mask_val, force);
> +
> + cpumask_and(&amask, &priv->lmask, mask_val);
> +
> + if (force)
> + cpu = cpumask_first(&amask);
> + else
> + cpu = cpumask_any_and(&amask, cpu_online_mask);
> +
> + if (cpu >= nr_cpu_ids)
> + return -EINVAL;
> +
> + idc = per_cpu_ptr(&aplic_idcs, cpu);
> + target = priv->regs + APLIC_TARGET_BASE;
> + target += (d->hwirq - 1) * sizeof(u32);
> + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> + val |= APLIC_DEFAULT_PRIORITY;
> + writel(val, target);
> +
> + irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +
> + return IRQ_SET_MASK_OK_DONE;
> +}
> +#endif
> +
> +static struct irq_chip aplic_chip = {
> + .name = "RISC-V APLIC",
> + .irq_mask = aplic_irq_mask,
> + .irq_unmask = aplic_irq_unmask,
> + .irq_set_type = aplic_set_type,
> + .irq_eoi = aplic_irq_eoi,
> +#ifdef CONFIG_SMP
> + .irq_set_affinity = aplic_set_affinity,
> +#endif
> + .flags = IRQCHIP_SET_TYPE_MASKED |
> + IRQCHIP_SKIP_SET_WAKE |
> + IRQCHIP_MASK_ON_SUSPEND,
> +};
> +
> +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> + u32 gsi_base,
> + unsigned long *hwirq,
> + unsigned int *type)
> +{
> + if (WARN_ON(fwspec->param_count < 2))
> + return -EINVAL;
> + if (WARN_ON(!fwspec->param[0]))
> + return -EINVAL;
> +
> + /* For DT, gsi_base is always zero. */
> + *hwirq = fwspec->param[0] - gsi_base;
> + *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> +
> + WARN_ON(*type == IRQ_TYPE_NONE);
> +
> + return 0;
> +}
> +
> +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> + struct irq_fwspec *fwspec,
> + unsigned long *hwirq,
> + unsigned int *type)
> +{
> + struct aplic_priv *priv = platform_msi_get_host_data(d);
> +
> + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> +}
> +
> +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> + unsigned int virq, unsigned int nr_irqs,
> + void *arg)
> +{
> + int i, ret;
> + unsigned int type;
> + irq_hw_number_t hwirq;
> + struct irq_fwspec *fwspec = arg;
> + struct aplic_priv *priv = platform_msi_get_host_data(domain);
> +
> + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> + if (ret)
> + return ret;
> +
> + ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> + if (ret)
> + return ret;
> +
> + for (i = 0; i < nr_irqs; i++) {
> + irq_domain_set_info(domain, virq + i, hwirq + i,
> + &aplic_chip, priv, handle_fasteoi_irq,
> + NULL, NULL);
> + /*
> + * APLIC does not implement irq_disable() so Linux interrupt
> + * subsystem will take a lazy approach for disabling an APLIC
> + * interrupt. This means APLIC interrupts are left unmasked
> + * upon system suspend and interrupts are not processed
> + * immediately upon system wake up. To tackle this, we disable
> + * the lazy approach for all APLIC interrupts.
> + */
> + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> + }
> +
> + return 0;
> +}
> +
> +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> + .translate = aplic_irqdomain_msi_translate,
> + .alloc = aplic_irqdomain_msi_alloc,
> + .free = platform_msi_device_domain_free,
> +};
> +
> +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> + struct irq_fwspec *fwspec,
> + unsigned long *hwirq,
> + unsigned int *type)
> +{
> + struct aplic_priv *priv = d->host_data;
> +
> + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> +}
> +
> +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> + unsigned int virq, unsigned int nr_irqs,
> + void *arg)
> +{
> + int i, ret;
> + unsigned int type;
> + irq_hw_number_t hwirq;
> + struct irq_fwspec *fwspec = arg;
> + struct aplic_priv *priv = domain->host_data;
> +
> + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> + if (ret)
> + return ret;
> +
> + for (i = 0; i < nr_irqs; i++) {
> + irq_domain_set_info(domain, virq + i, hwirq + i,
> + &aplic_chip, priv, handle_fasteoi_irq,
> + NULL, NULL);
> + irq_set_affinity(virq + i, &priv->lmask);
> + /* See the reason described in aplic_irqdomain_msi_alloc() */
> + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> + }
> +
> + return 0;
> +}
> +
> +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> + .translate = aplic_irqdomain_idc_translate,
> + .alloc = aplic_irqdomain_idc_alloc,
> + .free = irq_domain_free_irqs_top,
> +};
> +
> +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> +{
> + int i;
> +
> + /* Disable all interrupts */
> + for (i = 0; i <= priv->nr_irqs; i += 32)
> + writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> + (i / 32) * sizeof(u32));
> +
> + /* Set interrupt type and default priority for all interrupts */
> + for (i = 1; i <= priv->nr_irqs; i++) {
> + writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> + (i - 1) * sizeof(u32));
> + writel(APLIC_DEFAULT_PRIORITY,
> + priv->regs + APLIC_TARGET_BASE +
> + (i - 1) * sizeof(u32));
> + }
> +
> + /* Clear APLIC domaincfg */
> + writel(0, priv->regs + APLIC_DOMAINCFG);
> +}
> +
> +static void aplic_init_hw_global(struct aplic_priv *priv)
> +{
> + u32 val;
> +#ifdef CONFIG_RISCV_M_MODE
> + u32 valH;
> +
> + if (!priv->nr_idcs) {
> + val = priv->msicfg.base_ppn;
> + valH = (priv->msicfg.base_ppn >> 32) &
> + APLIC_xMSICFGADDRH_BAPPN_MASK;
> + valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> + << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> + valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> + << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> + valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> + << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> + valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> + << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> + writel(val, priv->regs + APLIC_xMSICFGADDR);
> + writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> + }
> +#endif
> +
> + /* Setup APLIC domaincfg register */
> + val = readl(priv->regs + APLIC_DOMAINCFG);
> + val |= APLIC_DOMAINCFG_IE;
> + if (!priv->nr_idcs)
> + val |= APLIC_DOMAINCFG_DM;
> + writel(val, priv->regs + APLIC_DOMAINCFG);
> + if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> + pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> + priv->fwnode, val);
> +}
> +
> +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> +{
> + unsigned int group_index, hart_index, guest_index, val;
> + struct irq_data *d = irq_get_irq_data(desc->irq);
> + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> + struct aplic_msicfg *mc = &priv->msicfg;
> + phys_addr_t tppn, tbppn, msg_addr;
> + void __iomem *target;
> +
> + /* For zeroed MSI, simply write zero into the target register */
> + if (!msg->address_hi && !msg->address_lo && !msg->data) {
> + target = priv->regs + APLIC_TARGET_BASE;
> + target += (d->hwirq - 1) * sizeof(u32);
> + writel(0, target);
> + return;
> + }
> +
> + /* Sanity check on message data */
> + WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> +
> + /* Compute target MSI address */
> + msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> + tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> +
> + /* Compute target HART Base PPN */
> + tbppn = tppn;
> + tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> + tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> + tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> + WARN_ON(tbppn != mc->base_ppn);
> +
> + /* Compute target group and hart indexes */
> + group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> + APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> + hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> + APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> + hart_index |= (group_index << mc->lhxw);
> + WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> +
> + /* Compute target guest index */
> + guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> + WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> +
> + /* Update IRQ TARGET register */
> + target = priv->regs + APLIC_TARGET_BASE;
> + target += (d->hwirq - 1) * sizeof(u32);
> + val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> + << APLIC_TARGET_HART_IDX_SHIFT;
> + val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> + << APLIC_TARGET_GUEST_IDX_SHIFT;
> + val |= (msg->data & APLIC_TARGET_EIID_MASK);
> + writel(val, target);
> +}
> +
> +static int aplic_setup_msi(struct aplic_priv *priv)
> +{
> + struct aplic_msicfg *mc = &priv->msicfg;
> + const struct imsic_global_config *imsic_global;
> +
> + /*
> + * The APLIC outgoing MSI config registers assume target MSI
> + * controller to be RISC-V AIA IMSIC controller.
> + */
> + imsic_global = imsic_get_global_config();
> + if (!imsic_global) {
> + pr_err("%pfwP: IMSIC global config not found\n",
> + priv->fwnode);
> + return -ENODEV;
> + }
> +
> + /* Find number of guest index bits (LHXS) */
> + mc->lhxs = imsic_global->guest_index_bits;
> + if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> + pr_err("%pfwP: IMSIC guest index bits too big for APLIC LHXS\n",
> + priv->fwnode);
> + return -EINVAL;
> + }
> +
> + /* Find number of HART index bits (LHXW) */
> + mc->lhxw = imsic_global->hart_index_bits;
> + if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> + pr_err("%pfwP: IMSIC hart index bits too big for APLIC LHXW\n",
> + priv->fwnode);
> + return -EINVAL;
> + }
> +
> + /* Find number of group index bits (HHXW) */
> + mc->hhxw = imsic_global->group_index_bits;
> + if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> + pr_err("%pfwP: IMSIC group index bits too big for APLIC HHXW\n",
> + priv->fwnode);
> + return -EINVAL;
> + }
> +
> + /* Find first bit position of group index (HHXS) */
> + mc->hhxs = imsic_global->group_index_shift;
> + if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> + pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> + priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> + return -EINVAL;
> + }
> + mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> + if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> + pr_err("%pfwP: IMSIC group index shift too big for APLIC HHXS\n",
> + priv->fwnode);
> + return -EINVAL;
> + }
> +
> + /* Compute PPN base */
> + mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> +
> + /* Use all possible CPUs as lmask */
> + cpumask_copy(&priv->lmask, cpu_possible_mask);
> +
> + return 0;
> +}
> +
> +/*
> + * To handle APLIC IDC interrupts, we just read the CLAIMI register
> + * which will return the highest priority pending interrupt and clear the
> + * pending bit of the interrupt. This process is repeated until the CLAIMI
> + * register returns zero.
> + */
> +static void aplic_idc_handle_irq(struct irq_desc *desc)
> +{
> + struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> + struct irq_chip *chip = irq_desc_get_chip(desc);
> + irq_hw_number_t hw_irq;
> + int irq;
> +
> + chained_irq_enter(chip, desc);
> +
> + while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> + hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> + irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> +
> + if (unlikely(irq <= 0))
> + pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> + hw_irq);
> + else
> + generic_handle_irq(irq);
> + }
> +
> + chained_irq_exit(chip, desc);
> +}
> +
> +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> +{
> + u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> + u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> +
> + /* Priority must be less than threshold for interrupt triggering */
> + writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> +
> + /* Delivery must be set to 1 for interrupt triggering */
> + writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> +}
> +
> +static int aplic_idc_dying_cpu(unsigned int cpu)
> +{
> + if (aplic_idc_parent_irq)
> + disable_percpu_irq(aplic_idc_parent_irq);
> +
> + return 0;
> +}
> +
> +static int aplic_idc_starting_cpu(unsigned int cpu)
> +{
> + if (aplic_idc_parent_irq)
> + enable_percpu_irq(aplic_idc_parent_irq,
> + irq_get_trigger_type(aplic_idc_parent_irq));
> +
> + return 0;
> +}
> +
> +static int aplic_setup_idc(struct aplic_priv *priv)
> +{
> + int i, j, rc, cpu, setup_count = 0;
> + struct fwnode_reference_args parent;
> + struct irq_domain *domain;
> + unsigned long hartid;
> + struct aplic_idc *idc;
> + u32 val;
> +
> + /* Setup per-CPU IDC and target CPU mask */
> + for (i = 0; i < priv->nr_idcs; i++) {
> + rc = fwnode_property_get_reference_args(priv->fwnode,
> + "interrupts-extended", "#interrupt-cells",
> + 0, i, &parent);
> + if (rc) {
> + pr_warn("%pfwP: parent irq for IDC%d not found\n",
> + priv->fwnode, i);
> + continue;
> + }
> +
> + /*
> + * Skip interrupts other than external interrupts for
> + * current privilege level.
> + */
> + if (parent.args[0] != RV_IRQ_EXT)
> + continue;
> +
> + rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> + if (rc) {
> + pr_warn("%pfwP: invalid hartid for IDC%d\n",
> + priv->fwnode, i);
> + continue;
> + }
> +
> + cpu = riscv_hartid_to_cpuid(hartid);
> + if (cpu < 0) {
> + pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> + priv->fwnode, i);
> + continue;
> + }
> +
> + cpumask_set_cpu(cpu, &priv->lmask);
> +
> + idc = per_cpu_ptr(&aplic_idcs, cpu);
> + idc->hart_index = i;
> + idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> + idc->priv = priv;
> +
> + aplic_idc_set_delivery(idc, true);
> +
> + /*
> + * Boot cpu might not have APLIC hart_index = 0 so check
> + * and update target registers of all interrupts.
> + */
> + if (cpu == smp_processor_id() && idc->hart_index) {
> + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> + val |= APLIC_DEFAULT_PRIORITY;
> + for (j = 1; j <= priv->nr_irqs; j++)
> + writel(val, priv->regs + APLIC_TARGET_BASE +
> + (j - 1) * sizeof(u32));
> + }
> +
> + setup_count++;
> + }
> +
> + /* Find parent domain and register chained handler */
> + domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> + DOMAIN_BUS_ANY);
> + if (!aplic_idc_parent_irq && domain) {
> + aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> + if (aplic_idc_parent_irq) {
> + irq_set_chained_handler(aplic_idc_parent_irq,
> + aplic_idc_handle_irq);
> +
> + /*
> + * Setup CPUHP notifier to enable IDC parent
> + * interrupt on all CPUs
> + */
> + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> + "irqchip/riscv/aplic:starting",
> + aplic_idc_starting_cpu,
> + aplic_idc_dying_cpu);
> + }
> + }
> +
> + /* Fail if we were not able to setup IDC for any CPU */
> + return (setup_count) ? 0 : -ENODEV;
> +}
> +
> +static int aplic_probe(struct platform_device *pdev)
> +{
> + struct fwnode_handle *fwnode = pdev->dev.fwnode;
> + struct fwnode_reference_args parent;
> + struct aplic_priv *priv;
> + struct resource *res;
> + phys_addr_t pa;
> + int rc;
> +
> + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> + if (!priv)
> + return -ENOMEM;
> + priv->fwnode = fwnode;
> +
> + /* Map the MMIO registers */
> + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + if (!res) {
> + pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> + return -EINVAL;
> + }
> + priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> + if (!priv->regs) {
> + pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
> + return -ENOMEM;
> + }
> +
> + /*
> + * Find out GSI base number
> + *
> + * Note: DT does not define "riscv,gsi-base" property so GSI
> + * base is always zero for DT.
> + */
> + rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> + &priv->gsi_base, 1);
> + if (rc)
> + priv->gsi_base = 0;
> +
> + /* Find out number of interrupt sources */
> + rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> + &priv->nr_irqs, 1);
> + if (rc) {
> + pr_err("%pfwP: failed to get number of interrupt sources\n",
> + fwnode);
> + return rc;
> + }
> +
> + /* Setup initial state of APLIC interrupts */
> + aplic_init_hw_irqs(priv);
> +
> + /*
> + * Find out number of IDCs based on parent interrupts
> + *
> + * If "msi-parent" property is present then we ignore the
> + * APLIC IDCs which forces the APLIC driver to use MSI mode.
> + */
> + if (!fwnode_property_present(fwnode, "msi-parent")) {
> + while (!fwnode_property_get_reference_args(fwnode,
> + "interrupts-extended", "#interrupt-cells",
> + 0, priv->nr_idcs, &parent))
> + priv->nr_idcs++;
> + }
> +
> + /* Setup IDCs or MSIs based on number of IDCs */
> + if (priv->nr_idcs)
> + rc = aplic_setup_idc(priv);
> + else
> + rc = aplic_setup_msi(priv);
> + if (rc) {
> + pr_err("%pfwP: failed to setup %s\n",
> + fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> + return rc;
> + }
> +
> + /* Setup global config and interrupt delivery */
> + aplic_init_hw_global(priv);
> +
> + /* Create irq domain instance for the APLIC */
> + if (priv->nr_idcs)
> + priv->irqdomain = irq_domain_create_linear(
> + priv->fwnode,
> + priv->nr_irqs + 1,
> + &aplic_irqdomain_idc_ops,
> + priv);
> + else
> + priv->irqdomain = platform_msi_create_device_domain(
> + &pdev->dev,
> + priv->nr_irqs + 1,
> + aplic_msi_write_msg,
> + &aplic_irqdomain_msi_ops,
> + priv);
> + if (!priv->irqdomain) {
> + pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> + return -ENOMEM;
> + }
> +
> + /* Advertise the interrupt controller */
> + if (priv->nr_idcs) {
> + pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> + priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> + } else {
> + pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> + pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
> + priv->fwnode, priv->nr_irqs, &pa);
> + }
> +
> + return 0;
> +}
> +
> +static const struct of_device_id aplic_match[] = {
> + { .compatible = "riscv,aplic" },
> + {}
> +};
> +
> +static struct platform_driver aplic_driver = {
> + .driver = {
> + .name = "riscv-aplic",
> + .of_match_table = aplic_match,
> + },
> + .probe = aplic_probe,
> +};
> +builtin_platform_driver(aplic_driver);
> +
> +static int __init aplic_dt_init(struct device_node *node,
> + struct device_node *parent)
> +{
> + /*
> + * The APLIC platform driver needs to be probed early
> + * so for device tree:
> + *
> + * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> + * provides a hint to the device driver core to probe the
> + * platform driver early.
> + * 2) Clear the OF_POPULATED flag in device_node because
> + * of_irq_init() sets it which prevents creation of
> + * platform device.
> + */
> + node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;

NACK. You are blindly plastering flags without trying to understand
the real issue and fixing this correctly.

> + of_node_clear_flag(node, OF_POPULATED);
> + return 0;
> +}
> +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);

This macro pretty much bypasses the driver core's probe framework and
calls the init function directly, and you are supposed to initialize
the device when that init function is called.

If you want your device/driver to follow the proper platform driver
path (which is recommended), then you need to use the
IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.

I offered to help you debug this issue and I asked for a dts file that
corresponds to a board you are testing this on and seeing an issue.
But you haven't answered my question [1] and are pointing to some
random commit and blaming it. That commit has no impact on any
existing devices/drivers.

Hi Marc,

Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
is used or until Anup actually works with us to debug the real issue.

-Saravana
[1] - https://lore.kernel.org/lkml/CAAhSdy2p6K70fc2yZLPdVGqEq61Y8F7WVT2J8st5mQrzBi4WHg@mail.gmail.com/


> diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
> new file mode 100644
> index 000000000000..97e198ea0109
> --- /dev/null
> +++ b/include/linux/irqchip/riscv-aplic.h
> @@ -0,0 +1,119 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
> +#define __LINUX_IRQCHIP_RISCV_APLIC_H
> +
> +#include <linux/bitops.h>
> +
> +#define APLIC_MAX_IDC BIT(14)
> +#define APLIC_MAX_SOURCE 1024
> +
> +#define APLIC_DOMAINCFG 0x0000
> +#define APLIC_DOMAINCFG_RDONLY 0x80000000
> +#define APLIC_DOMAINCFG_IE BIT(8)
> +#define APLIC_DOMAINCFG_DM BIT(2)
> +#define APLIC_DOMAINCFG_BE BIT(0)
> +
> +#define APLIC_SOURCECFG_BASE 0x0004
> +#define APLIC_SOURCECFG_D BIT(10)
> +#define APLIC_SOURCECFG_CHILDIDX_MASK 0x000003ff
> +#define APLIC_SOURCECFG_SM_MASK 0x00000007
> +#define APLIC_SOURCECFG_SM_INACTIVE 0x0
> +#define APLIC_SOURCECFG_SM_DETACH 0x1
> +#define APLIC_SOURCECFG_SM_EDGE_RISE 0x4
> +#define APLIC_SOURCECFG_SM_EDGE_FALL 0x5
> +#define APLIC_SOURCECFG_SM_LEVEL_HIGH 0x6
> +#define APLIC_SOURCECFG_SM_LEVEL_LOW 0x7
> +
> +#define APLIC_MMSICFGADDR 0x1bc0
> +#define APLIC_MMSICFGADDRH 0x1bc4
> +#define APLIC_SMSICFGADDR 0x1bc8
> +#define APLIC_SMSICFGADDRH 0x1bcc
> +
> +#ifdef CONFIG_RISCV_M_MODE
> +#define APLIC_xMSICFGADDR APLIC_MMSICFGADDR
> +#define APLIC_xMSICFGADDRH APLIC_MMSICFGADDRH
> +#else
> +#define APLIC_xMSICFGADDR APLIC_SMSICFGADDR
> +#define APLIC_xMSICFGADDRH APLIC_SMSICFGADDRH
> +#endif
> +
> +#define APLIC_xMSICFGADDRH_L BIT(31)
> +#define APLIC_xMSICFGADDRH_HHXS_MASK 0x1f
> +#define APLIC_xMSICFGADDRH_HHXS_SHIFT 24
> +#define APLIC_xMSICFGADDRH_LHXS_MASK 0x7
> +#define APLIC_xMSICFGADDRH_LHXS_SHIFT 20
> +#define APLIC_xMSICFGADDRH_HHXW_MASK 0x7
> +#define APLIC_xMSICFGADDRH_HHXW_SHIFT 16
> +#define APLIC_xMSICFGADDRH_LHXW_MASK 0xf
> +#define APLIC_xMSICFGADDRH_LHXW_SHIFT 12
> +#define APLIC_xMSICFGADDRH_BAPPN_MASK 0xfff
> +
> +#define APLIC_xMSICFGADDR_PPN_SHIFT 12
> +
> +#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
> + (BIT(__lhxs) - 1)
> +
> +#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
> + (BIT(__lhxw) - 1)
> +#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
> + ((__lhxs))
> +#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
> + (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
> + APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
> +
> +#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
> + (BIT(__hhxw) - 1)
> +#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
> + ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
> +#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
> + (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
> + APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
> +
> +#define APLIC_IRQBITS_PER_REG 32
> +
> +#define APLIC_SETIP_BASE 0x1c00
> +#define APLIC_SETIPNUM 0x1cdc
> +
> +#define APLIC_CLRIP_BASE 0x1d00
> +#define APLIC_CLRIPNUM 0x1ddc
> +
> +#define APLIC_SETIE_BASE 0x1e00
> +#define APLIC_SETIENUM 0x1edc
> +
> +#define APLIC_CLRIE_BASE 0x1f00
> +#define APLIC_CLRIENUM 0x1fdc
> +
> +#define APLIC_SETIPNUM_LE 0x2000
> +#define APLIC_SETIPNUM_BE 0x2004
> +
> +#define APLIC_GENMSI 0x3000
> +
> +#define APLIC_TARGET_BASE 0x3004
> +#define APLIC_TARGET_HART_IDX_SHIFT 18
> +#define APLIC_TARGET_HART_IDX_MASK 0x3fff
> +#define APLIC_TARGET_GUEST_IDX_SHIFT 12
> +#define APLIC_TARGET_GUEST_IDX_MASK 0x3f
> +#define APLIC_TARGET_IPRIO_MASK 0xff
> +#define APLIC_TARGET_EIID_MASK 0x7ff
> +
> +#define APLIC_IDC_BASE 0x4000
> +#define APLIC_IDC_SIZE 32
> +
> +#define APLIC_IDC_IDELIVERY 0x00
> +
> +#define APLIC_IDC_IFORCE 0x04
> +
> +#define APLIC_IDC_ITHRESHOLD 0x08
> +
> +#define APLIC_IDC_TOPI 0x18
> +#define APLIC_IDC_TOPI_ID_SHIFT 16
> +#define APLIC_IDC_TOPI_ID_MASK 0x3ff
> +#define APLIC_IDC_TOPI_PRIO_MASK 0xff
> +
> +#define APLIC_IDC_CLAIMI 0x1c
> +
> +#endif
> --
> 2.34.1
>
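
As a rough illustration of how the TARGET field macros in the quoted header fit together in MSI mode, the sketch below packs a hart index, guest index and EIID into one 32-bit word, and splits an IMSIC page address back into hart and guest indexes. The helper names (aplic_pack_target, aplic_split_msi_addr, struct msi_target) are hypothetical and not part of the patch. The split assumes a single IMSIC group (HHXW = 0) and a base PPN whose low index bits are zero. The shift and mask values are copied from the header; everything else is an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout copied from the quoted riscv-aplic.h header. */
#define APLIC_TARGET_HART_IDX_SHIFT	18
#define APLIC_TARGET_HART_IDX_MASK	0x3fff
#define APLIC_TARGET_GUEST_IDX_SHIFT	12
#define APLIC_TARGET_GUEST_IDX_MASK	0x3f
#define APLIC_TARGET_EIID_MASK		0x7ff
#define APLIC_xMSICFGADDR_PPN_SHIFT	12

/* Hypothetical helper: build an MSI-mode TARGET register value. */
uint32_t aplic_pack_target(uint32_t hart, uint32_t guest, uint32_t eiid)
{
	/* Out-of-range inputs are rejected here; the patch uses WARN_ON(). */
	assert(hart <= APLIC_TARGET_HART_IDX_MASK);
	assert(guest <= APLIC_TARGET_GUEST_IDX_MASK);
	assert(eiid <= APLIC_TARGET_EIID_MASK);

	return (hart << APLIC_TARGET_HART_IDX_SHIFT) |
	       (guest << APLIC_TARGET_GUEST_IDX_SHIFT) |
	       eiid;
}

/* Hypothetical result type for the address split below. */
struct msi_target {
	uint32_t hart;
	uint32_t guest;
};

/*
 * Hypothetical helper: recover hart and guest indexes from an IMSIC
 * page address. lhxs is the number of guest index bits and lhxw the
 * number of hart index bits. Group index bits are assumed absent.
 */
struct msi_target aplic_split_msi_addr(uint64_t addr, unsigned int lhxs,
				       unsigned int lhxw)
{
	uint64_t ppn = addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
	struct msi_target t;

	/* Guest index occupies the lowest lhxs bits of the PPN. */
	t.guest = (uint32_t)(ppn & ((1ULL << lhxs) - 1));
	/* Hart index sits directly above the guest index bits. */
	t.hart = (uint32_t)((ppn >> lhxs) & ((1ULL << lhxw) - 1));

	return t;
}
```

For example, with lhxs = 1 and lhxw = 2, the address 0x28003000 splits into hart index 1 and guest index 1, and aplic_pack_target(1, 1, 5) then yields 0x00041005.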

2023-06-15 19:41:35

by Conor Dooley

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

Hey Saravana,

On Thu, Jun 15, 2023 at 12:17:08PM -0700, Saravana Kannan wrote:
> On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <[email protected]> wrote:

btw, please try to delete the 100s of lines of unrelated context when
replying

> > +static int __init aplic_dt_init(struct device_node *node,
> > + struct device_node *parent)
> > +{
> > + /*
> > + * The APLIC platform driver needs to be probed early
> > + * so for device tree:
> > + *
> > + * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > + * provides a hint to the device driver core to probe the
> > + * platform driver early.
> > + * 2) Clear the OF_POPULATED flag in device_node because
> > + * of_irq_init() sets it which prevents creation of
> > + * platform device.
> > + */
> > + node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
>
> NACK. You are blindly plastering flags without trying to understand
> the real issue and fixing this correctly.
>
> > + of_node_clear_flag(node, OF_POPULATED);
> > + return 0;
> > +}
> > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
>
> This macro pretty much skips the entire driver core framework to probe
> and calls init and you are supposed to initialize the device when the
> init function is called.
>
> If you want your device/driver to follow the proper platform driver
> path (which is recommended), then you need to use the
> IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
>
> I offered to help you debug this issue and I asked for a dts file that
> corresponds to a board you are testing this on and seeing an issue.

There isn't a dts file for this because there's no publicly available
hardware that actually has an APLIC. Maybe Ventana have pre-production
silicon that has it, but otherwise it's a QEMU job.

Cheers,
Conor.

> But you haven't answered my question [1] and are pointing to some
> random commit and blaming it. That commit has no impact on any
> existing devices/drivers.
>
> Hi Marc,
>
> Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> is used or until Anup actually works with us to debug the real issue.
>
> -Saravana
> [1] - https://lore.kernel.org/lkml/CAAhSdy2p6K70fc2yZLPdVGqEq61Y8F7WVT2J8st5mQrzBi4WHg@mail.gmail.com/



2023-06-15 20:57:06

by Saravana Kannan

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

On Thu, Jun 15, 2023 at 12:31 PM Conor Dooley <[email protected]> wrote:
>
> Hey Saravana,
>
> On Thu, Jun 15, 2023 at 12:17:08PM -0700, Saravana Kannan wrote:
> > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <[email protected]> wrote:
>
> btw, please try to delete the 100s of lines of unrelated context when
> replying

I always feel like some people like me to do this and others don't.
Also, at times, people might want to reference the other lines of code
when replying to my point. That's why I generally leave them in.

>
> > > +static int __init aplic_dt_init(struct device_node *node,
> > > + struct device_node *parent)
> > > +{
> > > + /*
> > > + * The APLIC platform driver needs to be probed early
> > > + * so for device tree:
> > > + *
> > > + * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > + * provides a hint to the device driver core to probe the
> > > + * platform driver early.
> > > + * 2) Clear the OF_POPULATED flag in device_node because
> > > + * of_irq_init() sets it which prevents creation of
> > > + * platform device.
> > > + */
> > > + node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> >
> > NACK. You are blindly plastering flags without trying to understand
> > the real issue and fixing this correctly.
> >
> > > + of_node_clear_flag(node, OF_POPULATED);

Also, this part is not needed if the macros I mentioned below are used.

> > > + return 0;
> > > +}
> > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> >
> > This macro pretty much skips the entire driver core framework to probe
> > and calls init and you are supposed to initialize the device when the
> > init function is called.
> >
> > If you want your device/driver to follow the proper platform driver
> > path (which is recommended), then you need to use the
> > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> >
> > I offered to help you debug this issue and I asked for a dts file that
> > corresponds to a board you are testing this on and seeing an issue.
>
> There isn't a dts file for this because there's no publicly available
> hardware that actually has an APLIC. Maybe Ventana have pre-production
> silicon that has it, but otherwise it's a QEMU job.

1. QEMU example is fine too if it can be reproduced. I just asked for
a dts file because I need the full global view of the dependencies. At
a minimum, I'd at least expect to see some example DT and explanation
of what dependency is causing the IRQ device to not be initialized on
time, etc. Instead I just see random uses of flags with no description
of the actual issue.

2. If there's no dts available upstream, why should these drivers be
accepted? I thought the norm was to only accept drivers that can
actually be used.

-Saravana

>
> Cheers,
> Conor.
>
> > But you haven't answered my question [1] and are pointing to some
> > random commit and blaming it. That commit has no impact on any
> > existing devices/drivers.
> >
> > Hi Marc,
> >
> > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > is used or until Anup actually works with us to debug the real issue.
> >
> > -Saravana
> > [1] - https://lore.kernel.org/lkml/CAAhSdy2p6K70fc2yZLPdVGqEq61Y8F7WVT2J8st5mQrzBi4WHg@mail.gmail.com/

2023-06-15 21:44:52

by Conor Dooley

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

On Thu, Jun 15, 2023 at 01:45:55PM -0700, Saravana Kannan wrote:
> On Thu, Jun 15, 2023 at 12:31 PM Conor Dooley <[email protected]> wrote:
> > On Thu, Jun 15, 2023 at 12:17:08PM -0700, Saravana Kannan wrote:
> > > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <[email protected]> wrote:
> >
> > btw, please try to delete the 100s of lines of unrelated context when
> > replying
>
> I always feel like some people like me to do this and others don't.
> Also, at times, people might want to reference the other lines of code
> when replying to my point. That's why I generally leave them in.

Yah, perhaps I cull too aggressively but there's a middle ground ;)

> > > > +static int __init aplic_dt_init(struct device_node *node,
> > > > + struct device_node *parent)
> > > > +{
> > > > + /*
> > > > + * The APLIC platform driver needs to be probed early
> > > > + * so for device tree:
> > > > + *
> > > > + * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > > + * provides a hint to the device driver core to probe the
> > > > + * platform driver early.
> > > > + * 2) Clear the OF_POPULATED flag in device_node because
> > > > + * of_irq_init() sets it which prevents creation of
> > > > + * platform device.
> > > > + */
> > > > + node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> > >
> > > NACK. You are blindly plastering flags without trying to understand
> > > the real issue and fixing this correctly.
> > >
> > > > + of_node_clear_flag(node, OF_POPULATED);
>
> Also, this part is not needed if the macros I mentioned below are used.
>
> > > > + return 0;
> > > > +}
> > > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> > >
> > > This macro pretty much skips the entire driver core framework to probe
> > > and calls init and you are supposed to initialize the device when the
> > > init function is called.
> > >
> > > If you want your device/driver to follow the proper platform driver
> > > path (which is recommended), then you need to use the
> > > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> > >
> > > I offered to help you debug this issue and I asked for a dts file that
> > > corresponds to a board you are testing this on and seeing an issue.
> >
> > There isn't a dts file for this because there's no publicly available
> > hardware that actually has an APLIC. Maybe Ventana have pre-production
> > silicon that has it, but otherwise it's a QEMU job.
>
> 1. QEMU example is fine too if it can be reproduced. I just asked for
> a dts file because I need the full global view of the dependencies. At
> a minimum, I'd at least expect to see some example DT and explanation
> of what dependency is causing the IRQ device to not be initialized on
> time, etc. Instead I just see random uses of flags with no description
> of the actual issue.

It's Anup's responsibility to provide you with that information, I have
not reproduced this issue, so I won't mislead you with QEMU invocations
that may not be what's required to reproduce.

> 2. If there's no dts available upstream, why should these drivers be
> accepted? I thought the norm was to only accept drivers that can
> actually be used.

I think it's not unusual (and desirable?) to start the upstreaming
process for stuff before hardware is publicly available, so that once it
is, support is already upstream, or close to. I do know that people have
tested this series on FPGA-based hardware emulation platforms etc.
Posting patches for it also helps avoid duplication of effort between
the various vendors in RISC-V land, who would otherwise have to write
their own drivers. Also, the documented RISC-V policy for accepting
support for ISA stuff says:
We'll only accept patches for new modules or extensions if the
specifications for those modules or extensions are listed as being
unlikely to be incompatibly changed in the future. For
specifications from the RISC-V foundation this means "Frozen"
(Documentation/riscv/patch-acceptance.rst)
AIA (the spec behind the APLIC/IMSIC) is frozen, and qualifies from a
RISC-V point of view. What Marc is willing to accept, in terms of
pre-production hardware support, is up to him obviously!

Cheers,
Conor.

> > > But you haven't answered my question [1] and are pointing to some
> > > random commit and blaming it. That commit has no impact on any
> > > existing devices/drivers.
> > >
> > > Hi Marc,
> > >
> > > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > > is used or until Anup actually works with us to debug the real issue.
> > >
> > > -Saravana
> > > [1] - https://lore.kernel.org/lkml/CAAhSdy2p6K70fc2yZLPdVGqEq61Y8F7WVT2J8st5mQrzBi4WHg@mail.gmail.com/



2023-06-16 02:26:49

by Anup Patel

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

On Fri, Jun 16, 2023 at 12:47 AM Saravana Kannan <[email protected]> wrote:
>
> On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <[email protected]> wrote:
> >
> > The RISC-V advanced interrupt architecture (AIA) specification defines
> > a new interrupt controller for managing wired interrupts on a RISC-V
> > platform. This new interrupt controller is referred to as advanced
> > platform-level interrupt controller (APLIC) which can forward wired
> > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > signaled interrupts.
> > > (For more details refer to https://github.com/riscv/riscv-aia)
> >
> > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > platforms.
> >
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > drivers/irqchip/Kconfig | 6 +
> > drivers/irqchip/Makefile | 1 +
> > drivers/irqchip/irq-riscv-aplic.c | 765 ++++++++++++++++++++++++++++
> > include/linux/irqchip/riscv-aplic.h | 119 +++++
> > 4 files changed, 891 insertions(+)
> > create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> > create mode 100644 include/linux/irqchip/riscv-aplic.h
> >
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index d700980372ef..834c0329f583 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> > select IRQ_DOMAIN_HIERARCHY
> > select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> >
> > +config RISCV_APLIC
> > + bool
> > + depends on RISCV
> > + select IRQ_DOMAIN_HIERARCHY
> > + select GENERIC_MSI_IRQ
> > +
> > config RISCV_IMSIC
> > bool
> > depends on RISCV
> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > index 577bde3e986b..438b8e1a152c 100644
> > --- a/drivers/irqchip/Makefile
> > +++ b/drivers/irqchip/Makefile
> > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> > obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> > obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> > obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> > +obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic.o
> > obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
> > obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> > obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > new file mode 100644
> > index 000000000000..1e710fdf5608
> > --- /dev/null
> > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > @@ -0,0 +1,765 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#define pr_fmt(fmt) "riscv-aplic: " fmt
> > +#include <linux/bitops.h>
> > +#include <linux/cpu.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/irq.h>
> > +#include <linux/irqchip.h>
> > +#include <linux/irqchip/chained_irq.h>
> > +#include <linux/irqchip/riscv-aplic.h>
> > +#include <linux/irqchip/riscv-imsic.h>
> > +#include <linux/irqdomain.h>
> > +#include <linux/module.h>
> > +#include <linux/msi.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/smp.h>
> > +
> > +#define APLIC_DEFAULT_PRIORITY 1
> > +#define APLIC_DISABLE_IDELIVERY 0
> > +#define APLIC_ENABLE_IDELIVERY 1
> > +#define APLIC_DISABLE_ITHRESHOLD 1
> > +#define APLIC_ENABLE_ITHRESHOLD 0
> > +
> > +struct aplic_msicfg {
> > + phys_addr_t base_ppn;
> > + u32 hhxs;
> > + u32 hhxw;
> > + u32 lhxs;
> > + u32 lhxw;
> > +};
> > +
> > +struct aplic_idc {
> > + unsigned int hart_index;
> > + void __iomem *regs;
> > + struct aplic_priv *priv;
> > +};
> > +
> > +struct aplic_priv {
> > + struct fwnode_handle *fwnode;
> > + u32 gsi_base;
> > + u32 nr_irqs;
> > + u32 nr_idcs;
> > + void __iomem *regs;
> > + struct irq_domain *irqdomain;
> > + struct aplic_msicfg msicfg;
> > + struct cpumask lmask;
> > +};
> > +
> > +static unsigned int aplic_idc_parent_irq;
> > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > +
> > +static void aplic_irq_unmask(struct irq_data *d)
> > +{
> > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > + writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > +
> > + if (!priv->nr_idcs)
> > + irq_chip_unmask_parent(d);
> > +}
> > +
> > +static void aplic_irq_mask(struct irq_data *d)
> > +{
> > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > + writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > +
> > + if (!priv->nr_idcs)
> > + irq_chip_mask_parent(d);
> > +}
> > +
> > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > +{
> > + u32 val = 0;
> > + void __iomem *sourcecfg;
> > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > + switch (type) {
> > + case IRQ_TYPE_NONE:
> > + val = APLIC_SOURCECFG_SM_INACTIVE;
> > + break;
> > + case IRQ_TYPE_LEVEL_LOW:
> > + val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > + break;
> > + case IRQ_TYPE_LEVEL_HIGH:
> > + val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > + break;
> > + case IRQ_TYPE_EDGE_FALLING:
> > + val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > + break;
> > + case IRQ_TYPE_EDGE_RISING:
> > + val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > + sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > + writel(val, sourcecfg);
> > +
> > + return 0;
> > +}
> > +
> > +static void aplic_irq_eoi(struct irq_data *d)
> > +{
> > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > + u32 reg_off, reg_mask;
> > +
> > + /*
> > + * EOI handling is required only for level-triggered
> > + * interrupts in APLIC MSI mode.
> > + */
> > +
> > + if (priv->nr_idcs)
> > + return;
> > +
> > + reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> > + reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> > + switch (irqd_get_trigger_type(d)) {
> > + case IRQ_TYPE_LEVEL_LOW:
> > + if (!(readl(priv->regs + reg_off) & reg_mask))
> > + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > + break;
> > + case IRQ_TYPE_LEVEL_HIGH:
> > + if (readl(priv->regs + reg_off) & reg_mask)
> > + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > + break;
> > + }
> > +}
> > +
> > +#ifdef CONFIG_SMP
> > +static int aplic_set_affinity(struct irq_data *d,
> > + const struct cpumask *mask_val, bool force)
> > +{
> > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > + struct aplic_idc *idc;
> > + unsigned int cpu, val;
> > + struct cpumask amask;
> > + void __iomem *target;
> > +
> > + if (!priv->nr_idcs)
> > + return irq_chip_set_affinity_parent(d, mask_val, force);
> > +
> > + cpumask_and(&amask, &priv->lmask, mask_val);
> > +
> > + if (force)
> > + cpu = cpumask_first(&amask);
> > + else
> > + cpu = cpumask_any_and(&amask, cpu_online_mask);
> > +
> > + if (cpu >= nr_cpu_ids)
> > + return -EINVAL;
> > +
> > + idc = per_cpu_ptr(&aplic_idcs, cpu);
> > + target = priv->regs + APLIC_TARGET_BASE;
> > + target += (d->hwirq - 1) * sizeof(u32);
> > + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > + val |= APLIC_DEFAULT_PRIORITY;
> > + writel(val, target);
> > +
> > + irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > +
> > + return IRQ_SET_MASK_OK_DONE;
> > +}
> > +#endif
> > +
> > +static struct irq_chip aplic_chip = {
> > + .name = "RISC-V APLIC",
> > + .irq_mask = aplic_irq_mask,
> > + .irq_unmask = aplic_irq_unmask,
> > + .irq_set_type = aplic_set_type,
> > + .irq_eoi = aplic_irq_eoi,
> > +#ifdef CONFIG_SMP
> > + .irq_set_affinity = aplic_set_affinity,
> > +#endif
> > + .flags = IRQCHIP_SET_TYPE_MASKED |
> > + IRQCHIP_SKIP_SET_WAKE |
> > + IRQCHIP_MASK_ON_SUSPEND,
> > +};
> > +
> > +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> > + u32 gsi_base,
> > + unsigned long *hwirq,
> > + unsigned int *type)
> > +{
> > + if (WARN_ON(fwspec->param_count < 2))
> > + return -EINVAL;
> > + if (WARN_ON(!fwspec->param[0]))
> > + return -EINVAL;
> > +
> > + /* For DT, gsi_base is always zero. */
> > + *hwirq = fwspec->param[0] - gsi_base;
> > + *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > +
> > + WARN_ON(*type == IRQ_TYPE_NONE);
> > +
> > + return 0;
> > +}
> > +
> > +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> > + struct irq_fwspec *fwspec,
> > + unsigned long *hwirq,
> > + unsigned int *type)
> > +{
> > + struct aplic_priv *priv = platform_msi_get_host_data(d);
> > +
> > + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > +}
> > +
> > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > + unsigned int virq, unsigned int nr_irqs,
> > + void *arg)
> > +{
> > + int i, ret;
> > + unsigned int type;
> > + irq_hw_number_t hwirq;
> > + struct irq_fwspec *fwspec = arg;
> > + struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > +
> > + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > + if (ret)
> > + return ret;
> > +
> > + ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > + if (ret)
> > + return ret;
> > +
> > + for (i = 0; i < nr_irqs; i++) {
> > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > + &aplic_chip, priv, handle_fasteoi_irq,
> > + NULL, NULL);
> > + /*
> > + * APLIC does not implement irq_disable() so Linux interrupt
> > + * subsystem will take a lazy approach for disabling an APLIC
> > + * interrupt. This means APLIC interrupts are left unmasked
> > + * upon system suspend and interrupts are not processed
> > + * immediately upon system wake up. To tackle this, we disable
> > + * the lazy approach for all APLIC interrupts.
> > + */
> > + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > + .translate = aplic_irqdomain_msi_translate,
> > + .alloc = aplic_irqdomain_msi_alloc,
> > + .free = platform_msi_device_domain_free,
> > +};
> > +
> > +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> > + struct irq_fwspec *fwspec,
> > + unsigned long *hwirq,
> > + unsigned int *type)
> > +{
> > + struct aplic_priv *priv = d->host_data;
> > +
> > + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > +}
> > +
> > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > + unsigned int virq, unsigned int nr_irqs,
> > + void *arg)
> > +{
> > + int i, ret;
> > + unsigned int type;
> > + irq_hw_number_t hwirq;
> > + struct irq_fwspec *fwspec = arg;
> > + struct aplic_priv *priv = domain->host_data;
> > +
> > + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > + if (ret)
> > + return ret;
> > +
> > + for (i = 0; i < nr_irqs; i++) {
> > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > + &aplic_chip, priv, handle_fasteoi_irq,
> > + NULL, NULL);
> > + irq_set_affinity(virq + i, &priv->lmask);
> > + /* See the reason described in aplic_irqdomain_msi_alloc() */
> > + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > + .translate = aplic_irqdomain_idc_translate,
> > + .alloc = aplic_irqdomain_idc_alloc,
> > + .free = irq_domain_free_irqs_top,
> > +};
> > +
> > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > +{
> > + int i;
> > +
> > + /* Disable all interrupts */
> > + for (i = 0; i <= priv->nr_irqs; i += 32)
> > + writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > + (i / 32) * sizeof(u32));
> > +
> > + /* Set interrupt type and default priority for all interrupts */
> > + for (i = 1; i <= priv->nr_irqs; i++) {
> > + writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > + (i - 1) * sizeof(u32));
> > + writel(APLIC_DEFAULT_PRIORITY,
> > + priv->regs + APLIC_TARGET_BASE +
> > + (i - 1) * sizeof(u32));
> > + }
> > +
> > + /* Clear APLIC domaincfg */
> > + writel(0, priv->regs + APLIC_DOMAINCFG);
> > +}
> > +
> > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > +{
> > + u32 val;
> > +#ifdef CONFIG_RISCV_M_MODE
> > + u32 valH;
> > +
> > + if (!priv->nr_idcs) {
> > + val = priv->msicfg.base_ppn;
> > + valH = (priv->msicfg.base_ppn >> 32) &
> > + APLIC_xMSICFGADDRH_BAPPN_MASK;
> > + valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > + << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > + valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > + << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > + valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > + << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > + valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > + << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > + writel(val, priv->regs + APLIC_xMSICFGADDR);
> > + writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > + }
> > +#endif
> > +
> > + /* Setup APLIC domaincfg register */
> > + val = readl(priv->regs + APLIC_DOMAINCFG);
> > + val |= APLIC_DOMAINCFG_IE;
> > + if (!priv->nr_idcs)
> > + val |= APLIC_DOMAINCFG_DM;
> > + writel(val, priv->regs + APLIC_DOMAINCFG);
> > + if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > + pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> > + priv->fwnode, val);
> > +}
> > +
> > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > +{
> > + unsigned int group_index, hart_index, guest_index, val;
> > + struct irq_data *d = irq_get_irq_data(desc->irq);
> > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > + struct aplic_msicfg *mc = &priv->msicfg;
> > + phys_addr_t tppn, tbppn, msg_addr;
> > + void __iomem *target;
> > +
> > + /* For zeroed MSI, simply write zero into the target register */
> > + if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > + target = priv->regs + APLIC_TARGET_BASE;
> > + target += (d->hwirq - 1) * sizeof(u32);
> > + writel(0, target);
> > + return;
> > + }
> > +
> > + /* Sanity check on message data */
> > + WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > +
> > + /* Compute target MSI address */
> > + msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > + tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > +
> > + /* Compute target HART Base PPN */
> > + tbppn = tppn;
> > + tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > + tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > + tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > + WARN_ON(tbppn != mc->base_ppn);
> > +
> > + /* Compute target group and hart indexes */
> > + group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > + APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > + hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > + APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > + hart_index |= (group_index << mc->lhxw);
> > + WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > +
> > + /* Compute target guest index */
> > + guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > + WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > +
> > + /* Update IRQ TARGET register */
> > + target = priv->regs + APLIC_TARGET_BASE;
> > + target += (d->hwirq - 1) * sizeof(u32);
> > + val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > + << APLIC_TARGET_HART_IDX_SHIFT;
> > + val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > + << APLIC_TARGET_GUEST_IDX_SHIFT;
> > + val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > + writel(val, target);
> > +}
> > +
> > +static int aplic_setup_msi(struct aplic_priv *priv)
> > +{
> > + struct aplic_msicfg *mc = &priv->msicfg;
> > + const struct imsic_global_config *imsic_global;
> > +
> > + /*
> > + * The APLIC outgoing MSI config registers assume target MSI
> > + * controller to be RISC-V AIA IMSIC controller.
> > + */
> > + imsic_global = imsic_get_global_config();
> > + if (!imsic_global) {
> > + pr_err("%pfwP: IMSIC global config not found\n",
> > + priv->fwnode);
> > + return -ENODEV;
> > + }
> > +
> > + /* Find number of guest index bits (LHXS) */
> > + mc->lhxs = imsic_global->guest_index_bits;
> > + if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > + pr_err("%pfwP: IMSIC guest index bits too big for APLIC LHXS\n",
> > + priv->fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Find number of HART index bits (LHXW) */
> > + mc->lhxw = imsic_global->hart_index_bits;
> > + if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > + pr_err("%pfwP: IMSIC hart index bits too big for APLIC LHXW\n",
> > + priv->fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Find number of group index bits (HHXW) */
> > + mc->hhxw = imsic_global->group_index_bits;
> > + if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > + pr_err("%pfwP: IMSIC group index bits too big for APLIC HHXW\n",
> > + priv->fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Find first bit position of group index (HHXS) */
> > + mc->hhxs = imsic_global->group_index_shift;
> > + if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > + pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> > + priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > + return -EINVAL;
> > + }
> > + mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > + if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > + pr_err("%pfwP: IMSIC group index shift too big for APLIC HHXS\n",
> > + priv->fwnode);
> > + return -EINVAL;
> > + }
> > +
> > + /* Compute PPN base */
> > + mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > +
> > + /* Use all possible CPUs as lmask */
> > + cpumask_copy(&priv->lmask, cpu_possible_mask);
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * To handle APLIC IDC interrupts, we just read the CLAIMI register,
> > + * which returns the highest priority pending interrupt and clears the
> > + * pending bit of that interrupt. This process is repeated until the
> > + * CLAIMI register returns zero.
> > + */
> > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > +{
> > + struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > + struct irq_chip *chip = irq_desc_get_chip(desc);
> > + irq_hw_number_t hw_irq;
> > + int irq;
> > +
> > + chained_irq_enter(chip, desc);
> > +
> > + while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > + hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > + irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > +
> > + if (unlikely(irq <= 0))
> > + pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > + hw_irq);
> > + else
> > + generic_handle_irq(irq);
> > + }
> > +
> > + chained_irq_exit(chip, desc);
> > +}
> > +
> > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > +{
> > + u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > + u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > +
> > + /* Priority must be less than threshold for interrupt triggering */
> > + writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > +
> > + /* Delivery must be set to 1 for interrupt triggering */
> > + writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > +}
> > +
> > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > +{
> > + if (aplic_idc_parent_irq)
> > + disable_percpu_irq(aplic_idc_parent_irq);
> > +
> > + return 0;
> > +}
> > +
> > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > +{
> > + if (aplic_idc_parent_irq)
> > + enable_percpu_irq(aplic_idc_parent_irq,
> > + irq_get_trigger_type(aplic_idc_parent_irq));
> > +
> > + return 0;
> > +}
> > +
> > +static int aplic_setup_idc(struct aplic_priv *priv)
> > +{
> > + int i, j, rc, cpu, setup_count = 0;
> > + struct fwnode_reference_args parent;
> > + struct irq_domain *domain;
> > + unsigned long hartid;
> > + struct aplic_idc *idc;
> > + u32 val;
> > +
> > + /* Setup per-CPU IDC and target CPU mask */
> > + for (i = 0; i < priv->nr_idcs; i++) {
> > + rc = fwnode_property_get_reference_args(priv->fwnode,
> > + "interrupts-extended", "#interrupt-cells",
> > + 0, i, &parent);
> > + if (rc) {
> > + pr_warn("%pfwP: parent irq for IDC%d not found\n",
> > + priv->fwnode, i);
> > + continue;
> > + }
> > +
> > + /*
> > + * Skip interrupts other than external interrupts for
> > + * current privilege level.
> > + */
> > + if (parent.args[0] != RV_IRQ_EXT)
> > + continue;
> > +
> > + rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> > + if (rc) {
> > + pr_warn("%pfwP: invalid hartid for IDC%d\n",
> > + priv->fwnode, i);
> > + continue;
> > + }
> > +
> > + cpu = riscv_hartid_to_cpuid(hartid);
> > + if (cpu < 0) {
> > + pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> > + priv->fwnode, i);
> > + continue;
> > + }
> > +
> > + cpumask_set_cpu(cpu, &priv->lmask);
> > +
> > + idc = per_cpu_ptr(&aplic_idcs, cpu);
> > + idc->hart_index = i;
> > + idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > + idc->priv = priv;
> > +
> > + aplic_idc_set_delivery(idc, true);
> > +
> > + /*
> > + * Boot cpu might not have APLIC hart_index = 0 so check
> > + * and update target registers of all interrupts.
> > + */
> > + if (cpu == smp_processor_id() && idc->hart_index) {
> > + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > + val |= APLIC_DEFAULT_PRIORITY;
> > + for (j = 1; j <= priv->nr_irqs; j++)
> > + writel(val, priv->regs + APLIC_TARGET_BASE +
> > + (j - 1) * sizeof(u32));
> > + }
> > +
> > + setup_count++;
> > + }
> > +
> > + /* Find parent domain and register chained handler */
> > + domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > + DOMAIN_BUS_ANY);
> > + if (!aplic_idc_parent_irq && domain) {
> > + aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > + if (aplic_idc_parent_irq) {
> > + irq_set_chained_handler(aplic_idc_parent_irq,
> > + aplic_idc_handle_irq);
> > +
> > + /*
> > + * Setup CPUHP notifier to enable IDC parent
> > + * interrupt on all CPUs
> > + */
> > + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > + "irqchip/riscv/aplic:starting",
> > + aplic_idc_starting_cpu,
> > + aplic_idc_dying_cpu);
> > + }
> > + }
> > +
> > + /* Fail if we were not able to setup IDC for any CPU */
> > + return (setup_count) ? 0 : -ENODEV;
> > +}
> > +
> > +static int aplic_probe(struct platform_device *pdev)
> > +{
> > + struct fwnode_handle *fwnode = pdev->dev.fwnode;
> > + struct fwnode_reference_args parent;
> > + struct aplic_priv *priv;
> > + struct resource *res;
> > + phys_addr_t pa;
> > + int rc;
> > +
> > + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > + if (!priv)
> > + return -ENOMEM;
> > + priv->fwnode = fwnode;
> > +
> > + /* Map the MMIO registers */
> > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > + if (!res) {
> > + pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> > + return -EINVAL;
> > + }
> > + priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> > + if (!priv->regs) {
> > + pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
> > + return -ENOMEM;
> > + }
> > +
> > + /*
> > + * Find out GSI base number
> > + *
> > + * Note: DT does not define "riscv,gsi-base" property so GSI
> > + * base is always zero for DT.
> > + */
> > + rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> > + &priv->gsi_base, 1);
> > + if (rc)
> > + priv->gsi_base = 0;
> > +
> > + /* Find out number of interrupt sources */
> > + rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> > + &priv->nr_irqs, 1);
> > + if (rc) {
> > + pr_err("%pfwP: failed to get number of interrupt sources\n",
> > + fwnode);
> > + return rc;
> > + }
> > +
> > + /* Setup initial state APLIC interrupts */
> > + aplic_init_hw_irqs(priv);
> > +
> > + /*
> > + * Find out number of IDCs based on parent interrupts
> > + *
> > + * If "msi-parent" property is present then we ignore the
> > + * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > + */
> > + if (!fwnode_property_present(fwnode, "msi-parent")) {
> > + while (!fwnode_property_get_reference_args(fwnode,
> > + "interrupts-extended", "#interrupt-cells",
> > + 0, priv->nr_idcs, &parent))
> > + priv->nr_idcs++;
> > + }
> > +
> > + /* Setup IDCs or MSIs based on number of IDCs */
> > + if (priv->nr_idcs)
> > + rc = aplic_setup_idc(priv);
> > + else
> > + rc = aplic_setup_msi(priv);
> > + if (rc) {
> > + pr_err("%pfwP: failed to set up %s\n",
> > + fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> > + return rc;
> > + }
> > +
> > + /* Setup global config and interrupt delivery */
> > + aplic_init_hw_global(priv);
> > +
> > + /* Create irq domain instance for the APLIC */
> > + if (priv->nr_idcs)
> > + priv->irqdomain = irq_domain_create_linear(
> > + priv->fwnode,
> > + priv->nr_irqs + 1,
> > + &aplic_irqdomain_idc_ops,
> > + priv);
> > + else
> > + priv->irqdomain = platform_msi_create_device_domain(
> > + &pdev->dev,
> > + priv->nr_irqs + 1,
> > + aplic_msi_write_msg,
> > + &aplic_irqdomain_msi_ops,
> > + priv);
> > + if (!priv->irqdomain) {
> > + pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> > + return -ENOMEM;
> > + }
> > +
> > + /* Advertise the interrupt controller */
> > + if (priv->nr_idcs) {
> > + pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> > + priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> > + } else {
> > + pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > + pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
> > + priv->fwnode, priv->nr_irqs, &pa);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static const struct of_device_id aplic_match[] = {
> > + { .compatible = "riscv,aplic" },
> > + {}
> > +};
> > +
> > +static struct platform_driver aplic_driver = {
> > + .driver = {
> > + .name = "riscv-aplic",
> > + .of_match_table = aplic_match,
> > + },
> > + .probe = aplic_probe,
> > +};
> > +builtin_platform_driver(aplic_driver);
> > +
> > +static int __init aplic_dt_init(struct device_node *node,
> > + struct device_node *parent)
> > +{
> > + /*
> > + * The APLIC platform driver needs to be probed early
> > + * so for device tree:
> > + *
> > + * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > + * provides a hint to the device driver core to probe the
> > + * platform driver early.
> > + * 2) Clear the OF_POPULATED flag in device_node because
> > + * of_irq_init() sets it which prevents creation of
> > + * platform device.
> > + */
> > + node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
>
> NACK. You are blindly plastering flags without trying to understand
> the real issue and fixing this correctly.
>
> > + of_node_clear_flag(node, OF_POPULATED);
> > + return 0;
> > +}
> > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
>
> This macro pretty much skips the entire driver core framework to probe
> and calls init and you are supposed to initialize the device when the
> init function is called.
>
> If you want your device/driver to follow the proper platform driver
> path (which is recommended), then you need to use the
> IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
>
> I offered to help you debug this issue and I asked for a dts file that
> corresponds to a board you are testing this on and seeing an issue.
> But you haven't answered my question [1] and are pointing to some
> random commit and blaming it. That commit has no impact on any
> existing devices/drivers.
>
> Hi Marc,
>
> Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> is used or until Anup actually works with us to debug the real issue.

Maybe I misread your previous comment.

You can easily reproduce the issue on the QEMU virt machine for RISC-V:
1) Build qemu-system-riscv64 from the latest QEMU master
2) Build the kernel from the riscv_aia_v4 branch at https://github.com/avpatel/linux.git
(Note: make sure you remove the FWNODE_FLAG_BEST_EFFORT flag from
the APLIC driver when building the kernel)
3) Boot an APLIC-only system on the QEMU virt machine
qemu-system-riscv64 -smp 4 -M virt,aia=aplic -m 1G -nographic \
-bios opensbi/build/platform/generic/firmware/fw_dynamic.bin \
-kernel ./build-riscv64/arch/riscv/boot/Image \
-append "root=/dev/ram rw console=ttyS0 earlycon" \
-initrd ./rootfs_riscv64.img

I hope the above steps help you reproduce the issue. I will certainly
test whatever fix you propose.

Regards,
Anup


>
> -Saravana
> [1] - https://lore.kernel.org/lkml/CAAhSdy2p6K70fc2yZLPdVGqEq61Y8F7WVT2J8st5mQrzBi4WHg@mail.gmail.com/
>
>
> > diff --git a/include/linux/irqchip/riscv-aplic.h b/include/linux/irqchip/riscv-aplic.h
> > new file mode 100644
> > index 000000000000..97e198ea0109
> > --- /dev/null
> > +++ b/include/linux/irqchip/riscv-aplic.h
> > @@ -0,0 +1,119 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +#ifndef __LINUX_IRQCHIP_RISCV_APLIC_H
> > +#define __LINUX_IRQCHIP_RISCV_APLIC_H
> > +
> > +#include <linux/bitops.h>
> > +
> > +#define APLIC_MAX_IDC BIT(14)
> > +#define APLIC_MAX_SOURCE 1024
> > +
> > +#define APLIC_DOMAINCFG 0x0000
> > +#define APLIC_DOMAINCFG_RDONLY 0x80000000
> > +#define APLIC_DOMAINCFG_IE BIT(8)
> > +#define APLIC_DOMAINCFG_DM BIT(2)
> > +#define APLIC_DOMAINCFG_BE BIT(0)
> > +
> > +#define APLIC_SOURCECFG_BASE 0x0004
> > +#define APLIC_SOURCECFG_D BIT(10)
> > +#define APLIC_SOURCECFG_CHILDIDX_MASK 0x000003ff
> > +#define APLIC_SOURCECFG_SM_MASK 0x00000007
> > +#define APLIC_SOURCECFG_SM_INACTIVE 0x0
> > +#define APLIC_SOURCECFG_SM_DETACH 0x1
> > +#define APLIC_SOURCECFG_SM_EDGE_RISE 0x4
> > +#define APLIC_SOURCECFG_SM_EDGE_FALL 0x5
> > +#define APLIC_SOURCECFG_SM_LEVEL_HIGH 0x6
> > +#define APLIC_SOURCECFG_SM_LEVEL_LOW 0x7
> > +
> > +#define APLIC_MMSICFGADDR 0x1bc0
> > +#define APLIC_MMSICFGADDRH 0x1bc4
> > +#define APLIC_SMSICFGADDR 0x1bc8
> > +#define APLIC_SMSICFGADDRH 0x1bcc
> > +
> > +#ifdef CONFIG_RISCV_M_MODE
> > +#define APLIC_xMSICFGADDR APLIC_MMSICFGADDR
> > +#define APLIC_xMSICFGADDRH APLIC_MMSICFGADDRH
> > +#else
> > +#define APLIC_xMSICFGADDR APLIC_SMSICFGADDR
> > +#define APLIC_xMSICFGADDRH APLIC_SMSICFGADDRH
> > +#endif
> > +
> > +#define APLIC_xMSICFGADDRH_L BIT(31)
> > +#define APLIC_xMSICFGADDRH_HHXS_MASK 0x1f
> > +#define APLIC_xMSICFGADDRH_HHXS_SHIFT 24
> > +#define APLIC_xMSICFGADDRH_LHXS_MASK 0x7
> > +#define APLIC_xMSICFGADDRH_LHXS_SHIFT 20
> > +#define APLIC_xMSICFGADDRH_HHXW_MASK 0x7
> > +#define APLIC_xMSICFGADDRH_HHXW_SHIFT 16
> > +#define APLIC_xMSICFGADDRH_LHXW_MASK 0xf
> > +#define APLIC_xMSICFGADDRH_LHXW_SHIFT 12
> > +#define APLIC_xMSICFGADDRH_BAPPN_MASK 0xfff
> > +
> > +#define APLIC_xMSICFGADDR_PPN_SHIFT 12
> > +
> > +#define APLIC_xMSICFGADDR_PPN_HART(__lhxs) \
> > + (BIT(__lhxs) - 1)
> > +
> > +#define APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) \
> > + (BIT(__lhxw) - 1)
> > +#define APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs) \
> > + ((__lhxs))
> > +#define APLIC_xMSICFGADDR_PPN_LHX(__lhxw, __lhxs) \
> > + (APLIC_xMSICFGADDR_PPN_LHX_MASK(__lhxw) << \
> > + APLIC_xMSICFGADDR_PPN_LHX_SHIFT(__lhxs))
> > +
> > +#define APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) \
> > + (BIT(__hhxw) - 1)
> > +#define APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs) \
> > + ((__hhxs) + APLIC_xMSICFGADDR_PPN_SHIFT)
> > +#define APLIC_xMSICFGADDR_PPN_HHX(__hhxw, __hhxs) \
> > + (APLIC_xMSICFGADDR_PPN_HHX_MASK(__hhxw) << \
> > + APLIC_xMSICFGADDR_PPN_HHX_SHIFT(__hhxs))
> > +
> > +#define APLIC_IRQBITS_PER_REG 32
> > +
> > +#define APLIC_SETIP_BASE 0x1c00
> > +#define APLIC_SETIPNUM 0x1cdc
> > +
> > +#define APLIC_CLRIP_BASE 0x1d00
> > +#define APLIC_CLRIPNUM 0x1ddc
> > +
> > +#define APLIC_SETIE_BASE 0x1e00
> > +#define APLIC_SETIENUM 0x1edc
> > +
> > +#define APLIC_CLRIE_BASE 0x1f00
> > +#define APLIC_CLRIENUM 0x1fdc
> > +
> > +#define APLIC_SETIPNUM_LE 0x2000
> > +#define APLIC_SETIPNUM_BE 0x2004
> > +
> > +#define APLIC_GENMSI 0x3000
> > +
> > +#define APLIC_TARGET_BASE 0x3004
> > +#define APLIC_TARGET_HART_IDX_SHIFT 18
> > +#define APLIC_TARGET_HART_IDX_MASK 0x3fff
> > +#define APLIC_TARGET_GUEST_IDX_SHIFT 12
> > +#define APLIC_TARGET_GUEST_IDX_MASK 0x3f
> > +#define APLIC_TARGET_IPRIO_MASK 0xff
> > +#define APLIC_TARGET_EIID_MASK 0x7ff
> > +
> > +#define APLIC_IDC_BASE 0x4000
> > +#define APLIC_IDC_SIZE 32
> > +
> > +#define APLIC_IDC_IDELIVERY 0x00
> > +
> > +#define APLIC_IDC_IFORCE 0x04
> > +
> > +#define APLIC_IDC_ITHRESHOLD 0x08
> > +
> > +#define APLIC_IDC_TOPI 0x18
> > +#define APLIC_IDC_TOPI_ID_SHIFT 16
> > +#define APLIC_IDC_TOPI_ID_MASK 0x3ff
> > +#define APLIC_IDC_TOPI_PRIO_MASK 0xff
> > +
> > +#define APLIC_IDC_CLAIMI 0x1c
> > +
> > +#endif
> > --
> > 2.34.1
> >

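
[Editorial sketch, not part of the patch] In direct (IDC) mode the driver uses three small pieces of arithmetic: the per-hart IDC register block offset, the target register value built from the hart index and priority, and the interrupt ID extracted from a CLAIMI read. The user-space C below restates them with the constants from riscv-aplic.h. The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define IDC_BASE         0x4000  /* APLIC_IDC_BASE */
#define IDC_SIZE         32      /* APLIC_IDC_SIZE */
#define HART_IDX_SHIFT   18      /* APLIC_TARGET_HART_IDX_SHIFT */
#define HART_IDX_MASK    0x3fff  /* APLIC_TARGET_HART_IDX_MASK */
#define IPRIO_MASK       0xff    /* APLIC_TARGET_IPRIO_MASK */
#define TOPI_ID_SHIFT    16      /* APLIC_IDC_TOPI_ID_SHIFT */
#define TOPI_ID_MASK     0x3ff   /* APLIC_IDC_TOPI_ID_MASK */

/* Offset of the IDC register block for a given APLIC hart index */
static uint32_t idc_offset(uint32_t hart_index)
{
	assert(hart_index <= HART_IDX_MASK);
	return IDC_BASE + hart_index * IDC_SIZE;
}

/* Direct-mode target register value: hart index plus priority */
static uint32_t idc_target(uint32_t hart_index, uint32_t prio)
{
	return ((hart_index & HART_IDX_MASK) << HART_IDX_SHIFT) |
	       (prio & IPRIO_MASK);
}

/* Interrupt ID carried in the upper half of a CLAIMI value */
static uint32_t claimi_to_hwirq(uint32_t claimi)
{
	return (claimi >> TOPI_ID_SHIFT) & TOPI_ID_MASK;
}
```

For example, routing a source to hart index 2 at the driver's default priority of 1 gives idc_target(2, 1) == 0x00080001, and a CLAIMI read of 0x000A0001 decodes to interrupt ID 10.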
2023-06-16 22:15:26

by Saravana Kannan

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

On Thu, Jun 15, 2023 at 7:01 PM Anup Patel <[email protected]> wrote:
>
> On Fri, Jun 16, 2023 at 12:47 AM Saravana Kannan <[email protected]> wrote:
> >
> > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <[email protected]> wrote:
> > >
> > > The RISC-V advanced interrupt architecture (AIA) specification defines
> > > a new interrupt controller for managing wired interrupts on a RISC-V
> > > platform. This new interrupt controller is referred to as advanced
> > > platform-level interrupt controller (APLIC) which can forward wired
> > > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > > signaled interrupts.
> > > (For more details refer https://github.com/riscv/riscv-aia)
> > >
> > > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > > platforms.
> > >
> > > Signed-off-by: Anup Patel <[email protected]>
> > > ---
> > > drivers/irqchip/Kconfig | 6 +
> > > drivers/irqchip/Makefile | 1 +
> > > drivers/irqchip/irq-riscv-aplic.c | 765 ++++++++++++++++++++++++++++
> > > include/linux/irqchip/riscv-aplic.h | 119 +++++
> > > 4 files changed, 891 insertions(+)
> > > create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> > > create mode 100644 include/linux/irqchip/riscv-aplic.h
> > >
> > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > index d700980372ef..834c0329f583 100644
> > > --- a/drivers/irqchip/Kconfig
> > > +++ b/drivers/irqchip/Kconfig
> > > @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> > > select IRQ_DOMAIN_HIERARCHY
> > > select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> > >
> > > +config RISCV_APLIC
> > > + bool
> > > + depends on RISCV
> > > + select IRQ_DOMAIN_HIERARCHY
> > > + select GENERIC_MSI_IRQ
> > > +
> > > config RISCV_IMSIC
> > > bool
> > > depends on RISCV
> > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > > index 577bde3e986b..438b8e1a152c 100644
> > > --- a/drivers/irqchip/Makefile
> > > +++ b/drivers/irqchip/Makefile
> > > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> > > obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> > > obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> > > obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> > > +obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic.o
> > > obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
> > > obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> > > obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> > > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > > new file mode 100644
> > > index 000000000000..1e710fdf5608
> > > --- /dev/null
> > > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > > @@ -0,0 +1,765 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > > + */
> > > +
> > > +#define pr_fmt(fmt) "riscv-aplic: " fmt
> > > +#include <linux/bitops.h>
> > > +#include <linux/cpu.h>
> > > +#include <linux/interrupt.h>
> > > +#include <linux/io.h>
> > > +#include <linux/irq.h>
> > > +#include <linux/irqchip.h>
> > > +#include <linux/irqchip/chained_irq.h>
> > > +#include <linux/irqchip/riscv-aplic.h>
> > > +#include <linux/irqchip/riscv-imsic.h>
> > > +#include <linux/irqdomain.h>
> > > +#include <linux/module.h>
> > > +#include <linux/msi.h>
> > > +#include <linux/platform_device.h>
> > > +#include <linux/smp.h>
> > > +
> > > +#define APLIC_DEFAULT_PRIORITY 1
> > > +#define APLIC_DISABLE_IDELIVERY 0
> > > +#define APLIC_ENABLE_IDELIVERY 1
> > > +#define APLIC_DISABLE_ITHRESHOLD 1
> > > +#define APLIC_ENABLE_ITHRESHOLD 0
> > > +
> > > +struct aplic_msicfg {
> > > + phys_addr_t base_ppn;
> > > + u32 hhxs;
> > > + u32 hhxw;
> > > + u32 lhxs;
> > > + u32 lhxw;
> > > +};
> > > +
> > > +struct aplic_idc {
> > > + unsigned int hart_index;
> > > + void __iomem *regs;
> > > + struct aplic_priv *priv;
> > > +};
> > > +
> > > +struct aplic_priv {
> > > + struct fwnode_handle *fwnode;
> > > + u32 gsi_base;
> > > + u32 nr_irqs;
> > > + u32 nr_idcs;
> > > + void __iomem *regs;
> > > + struct irq_domain *irqdomain;
> > > + struct aplic_msicfg msicfg;
> > > + struct cpumask lmask;
> > > +};
> > > +
> > > +static unsigned int aplic_idc_parent_irq;
> > > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > > +
> > > +static void aplic_irq_unmask(struct irq_data *d)
> > > +{
> > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > +
> > > + writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > > +
> > > + if (!priv->nr_idcs)
> > > + irq_chip_unmask_parent(d);
> > > +}
> > > +
> > > +static void aplic_irq_mask(struct irq_data *d)
> > > +{
> > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > +
> > > + writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > > +
> > > + if (!priv->nr_idcs)
> > > + irq_chip_mask_parent(d);
> > > +}
> > > +
> > > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > > +{
> > > + u32 val = 0;
> > > + void __iomem *sourcecfg;
> > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > +
> > > + switch (type) {
> > > + case IRQ_TYPE_NONE:
> > > + val = APLIC_SOURCECFG_SM_INACTIVE;
> > > + break;
> > > + case IRQ_TYPE_LEVEL_LOW:
> > > + val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > > + break;
> > > + case IRQ_TYPE_LEVEL_HIGH:
> > > + val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > > + break;
> > > + case IRQ_TYPE_EDGE_FALLING:
> > > + val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > > + break;
> > > + case IRQ_TYPE_EDGE_RISING:
> > > + val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > > + break;
> > > + default:
> > > + return -EINVAL;
> > > + }
> > > +
> > > + sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > > + sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > > + writel(val, sourcecfg);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static void aplic_irq_eoi(struct irq_data *d)
> > > +{
> > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > + u32 reg_off, reg_mask;
> > > +
> > > + /*
> > > + * EOI handling is required only for level-triggered
> > > + * interrupts in APLIC MSI mode.
> > > + */
> > > +
> > > + if (priv->nr_idcs)
> > > + return;
> > > +
> > > + reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> > > + reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> > > + switch (irqd_get_trigger_type(d)) {
> > > + case IRQ_TYPE_LEVEL_LOW:
> > > + if (!(readl(priv->regs + reg_off) & reg_mask))
> > > + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > + break;
> > > + case IRQ_TYPE_LEVEL_HIGH:
> > > + if (readl(priv->regs + reg_off) & reg_mask)
> > > + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > + break;
> > > + }
> > > +}
> > > +
> > > +#ifdef CONFIG_SMP
> > > +static int aplic_set_affinity(struct irq_data *d,
> > > + const struct cpumask *mask_val, bool force)
> > > +{
> > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > + struct aplic_idc *idc;
> > > + unsigned int cpu, val;
> > > + struct cpumask amask;
> > > + void __iomem *target;
> > > +
> > > + if (!priv->nr_idcs)
> > > + return irq_chip_set_affinity_parent(d, mask_val, force);
> > > +
> > > + cpumask_and(&amask, &priv->lmask, mask_val);
> > > +
> > > + if (force)
> > > + cpu = cpumask_first(&amask);
> > > + else
> > > + cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > +
> > > + if (cpu >= nr_cpu_ids)
> > > + return -EINVAL;
> > > +
> > > + idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > + target = priv->regs + APLIC_TARGET_BASE;
> > > + target += (d->hwirq - 1) * sizeof(u32);
> > > + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > + val |= APLIC_DEFAULT_PRIORITY;
> > > + writel(val, target);
> > > +
> > > + irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > +
> > > + return IRQ_SET_MASK_OK_DONE;
> > > +}
> > > +#endif
> > > +
> > > +static struct irq_chip aplic_chip = {
> > > + .name = "RISC-V APLIC",
> > > + .irq_mask = aplic_irq_mask,
> > > + .irq_unmask = aplic_irq_unmask,
> > > + .irq_set_type = aplic_set_type,
> > > + .irq_eoi = aplic_irq_eoi,
> > > +#ifdef CONFIG_SMP
> > > + .irq_set_affinity = aplic_set_affinity,
> > > +#endif
> > > + .flags = IRQCHIP_SET_TYPE_MASKED |
> > > + IRQCHIP_SKIP_SET_WAKE |
> > > + IRQCHIP_MASK_ON_SUSPEND,
> > > +};
> > > +
> > > +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> > > + u32 gsi_base,
> > > + unsigned long *hwirq,
> > > + unsigned int *type)
> > > +{
> > > + if (WARN_ON(fwspec->param_count < 2))
> > > + return -EINVAL;
> > > + if (WARN_ON(!fwspec->param[0]))
> > > + return -EINVAL;
> > > +
> > > + /* For DT, gsi_base is always zero. */
> > > + *hwirq = fwspec->param[0] - gsi_base;
> > > + *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > > +
> > > + WARN_ON(*type == IRQ_TYPE_NONE);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> > > + struct irq_fwspec *fwspec,
> > > + unsigned long *hwirq,
> > > + unsigned int *type)
> > > +{
> > > + struct aplic_priv *priv = platform_msi_get_host_data(d);
> > > +
> > > + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > +}
> > > +
> > > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > > + unsigned int virq, unsigned int nr_irqs,
> > > + void *arg)
> > > +{
> > > + int i, ret;
> > > + unsigned int type;
> > > + irq_hw_number_t hwirq;
> > > + struct irq_fwspec *fwspec = arg;
> > > + struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > > +
> > > + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + for (i = 0; i < nr_irqs; i++) {
> > > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > > + &aplic_chip, priv, handle_fasteoi_irq,
> > > + NULL, NULL);
> > > + /*
> > > + * APLIC does not implement irq_disable() so Linux interrupt
> > > + * subsystem will take a lazy approach for disabling an APLIC
> > > + * interrupt. This means APLIC interrupts are left unmasked
> > > + * upon system suspend and interrupts are not processed
> > > + * immediately upon system wake up. To tackle this, we disable
> > > + * the lazy approach for all APLIC interrupts.
> > > + */
> > > + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > > + .translate = aplic_irqdomain_msi_translate,
> > > + .alloc = aplic_irqdomain_msi_alloc,
> > > + .free = platform_msi_device_domain_free,
> > > +};
> > > +
> > > +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> > > + struct irq_fwspec *fwspec,
> > > + unsigned long *hwirq,
> > > + unsigned int *type)
> > > +{
> > > + struct aplic_priv *priv = d->host_data;
> > > +
> > > + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > +}
> > > +
> > > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > > + unsigned int virq, unsigned int nr_irqs,
> > > + void *arg)
> > > +{
> > > + int i, ret;
> > > + unsigned int type;
> > > + irq_hw_number_t hwirq;
> > > + struct irq_fwspec *fwspec = arg;
> > > + struct aplic_priv *priv = domain->host_data;
> > > +
> > > + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + for (i = 0; i < nr_irqs; i++) {
> > > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > > + &aplic_chip, priv, handle_fasteoi_irq,
> > > + NULL, NULL);
> > > + irq_set_affinity(virq + i, &priv->lmask);
> > > + /* See the reason described in aplic_irqdomain_msi_alloc() */
> > > + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > > + .translate = aplic_irqdomain_idc_translate,
> > > + .alloc = aplic_irqdomain_idc_alloc,
> > > + .free = irq_domain_free_irqs_top,
> > > +};
> > > +
> > > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > > +{
> > > + int i;
> > > +
> > > + /* Disable all interrupts */
> > > + for (i = 0; i <= priv->nr_irqs; i += 32)
> > > + writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > > + (i / 32) * sizeof(u32));
> > > +
> > > + /* Set interrupt type and default priority for all interrupts */
> > > + for (i = 1; i <= priv->nr_irqs; i++) {
> > > + writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > > + (i - 1) * sizeof(u32));
> > > + writel(APLIC_DEFAULT_PRIORITY,
> > > + priv->regs + APLIC_TARGET_BASE +
> > > + (i - 1) * sizeof(u32));
> > > + }
> > > +
> > > + /* Clear APLIC domaincfg */
> > > + writel(0, priv->regs + APLIC_DOMAINCFG);
> > > +}
> > > +
> > > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > > +{
> > > + u32 val;
> > > +#ifdef CONFIG_RISCV_M_MODE
> > > + u32 valH;
> > > +
> > > + if (!priv->nr_idcs) {
> > > + val = priv->msicfg.base_ppn;
> > > + valH = (priv->msicfg.base_ppn >> 32) &
> > > + APLIC_xMSICFGADDRH_BAPPN_MASK;
> > > + valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > > + << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > > + valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > > + << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > > + valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > > + << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > > + valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > > + << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > > + writel(val, priv->regs + APLIC_xMSICFGADDR);
> > > + writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > > + }
> > > +#endif
> > > +
> > > + /* Setup APLIC domaincfg register */
> > > + val = readl(priv->regs + APLIC_DOMAINCFG);
> > > + val |= APLIC_DOMAINCFG_IE;
> > > + if (!priv->nr_idcs)
> > > + val |= APLIC_DOMAINCFG_DM;
> > > + writel(val, priv->regs + APLIC_DOMAINCFG);
> > > + if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > > + pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> > > + priv->fwnode, val);
> > > +}
> > > +
> > > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > > +{
> > > + unsigned int group_index, hart_index, guest_index, val;
> > > + struct irq_data *d = irq_get_irq_data(desc->irq);
> > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > + struct aplic_msicfg *mc = &priv->msicfg;
> > > + phys_addr_t tppn, tbppn, msg_addr;
> > > + void __iomem *target;
> > > +
> > > + /* For zeroed MSI, simply write zero into the target register */
> > > + if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > > + target = priv->regs + APLIC_TARGET_BASE;
> > > + target += (d->hwirq - 1) * sizeof(u32);
> > > + writel(0, target);
> > > + return;
> > > + }
> > > +
> > > + /* Sanity check on message data */
> > > + WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > > +
> > > + /* Compute target MSI address */
> > > + msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > > + tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > +
> > > + /* Compute target HART Base PPN */
> > > + tbppn = tppn;
> > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > + WARN_ON(tbppn != mc->base_ppn);
> > > +
> > > + /* Compute target group and hart indexes */
> > > + group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > > + APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > > + hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > > + APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > > + hart_index |= (group_index << mc->lhxw);
> > > + WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > > +
> > > + /* Compute target guest index */
> > > + guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > + WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > > +
> > > + /* Update IRQ TARGET register */
> > > + target = priv->regs + APLIC_TARGET_BASE;
> > > + target += (d->hwirq - 1) * sizeof(u32);
> > > + val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > > + << APLIC_TARGET_HART_IDX_SHIFT;
> > > + val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > > + << APLIC_TARGET_GUEST_IDX_SHIFT;
> > > + val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > > + writel(val, target);
> > > +}
> > > +
> > > +static int aplic_setup_msi(struct aplic_priv *priv)
> > > +{
> > > + struct aplic_msicfg *mc = &priv->msicfg;
> > > + const struct imsic_global_config *imsic_global;
> > > +
> > > + /*
> > > + * The APLIC outgoing MSI config registers assume target MSI
> > > + * controller to be RISC-V AIA IMSIC controller.
> > > + */
> > > + imsic_global = imsic_get_global_config();
> > > + if (!imsic_global) {
> > > + pr_err("%pfwP: IMSIC global config not found\n",
> > > + priv->fwnode);
> > > + return -ENODEV;
> > > + }
> > > +
> > > + /* Find number of guest index bits (LHXS) */
> > > + mc->lhxs = imsic_global->guest_index_bits;
> > > + if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > > + pr_err("%pfwP: IMSIC guest index bits big for APLIC LHXS\n",
> > > + priv->fwnode);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + /* Find number of HART index bits (LHXW) */
> > > + mc->lhxw = imsic_global->hart_index_bits;
> > > + if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > > + pr_err("%pfwP: IMSIC hart index bits big for APLIC LHXW\n",
> > > + priv->fwnode);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + /* Find number of group index bits (HHXW) */
> > > + mc->hhxw = imsic_global->group_index_bits;
> > > + if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > > + pr_err("%pfwP: IMSIC group index bits big for APLIC HHXW\n",
> > > + priv->fwnode);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + /* Find first bit position of group index (HHXS) */
> > > + mc->hhxs = imsic_global->group_index_shift;
> > > + if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > > + pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> > > + priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > > + return -EINVAL;
> > > + }
> > > + mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > > + if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > > + pr_err("%pfwP: IMSIC group index shift big for APLIC HHXS\n",
> > > + priv->fwnode);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + /* Compute PPN base */
> > > + mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > +
> > > + /* Use all possible CPUs as lmask */
> > > + cpumask_copy(&priv->lmask, cpu_possible_mask);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +/*
> > > + * To handle APLIC IDC interrupts, we just read the CLAIMI register,
> > > + * which returns the highest-priority pending interrupt and clears the
> > > + * pending bit of that interrupt. This is repeated until the CLAIMI
> > > + * register returns zero.
> > > + */
> > > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > > +{
> > > + struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > > + struct irq_chip *chip = irq_desc_get_chip(desc);
> > > + irq_hw_number_t hw_irq;
> > > + int irq;
> > > +
> > > + chained_irq_enter(chip, desc);
> > > +
> > > + while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > > + hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > > + irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > > +
> > > + if (unlikely(irq <= 0))
> > > + pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > > + hw_irq);
> > > + else
> > > + generic_handle_irq(irq);
> > > + }
> > > +
> > > + chained_irq_exit(chip, desc);
> > > +}
> > > +
> > > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > > +{
> > > + u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > > + u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > > +
> > > + /* Priority must be less than threshold for interrupt triggering */
> > > + writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > > +
> > > + /* Delivery must be set to 1 for interrupt triggering */
> > > + writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > > +}
> > > +
> > > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > > +{
> > > + if (aplic_idc_parent_irq)
> > > + disable_percpu_irq(aplic_idc_parent_irq);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > > +{
> > > + if (aplic_idc_parent_irq)
> > > + enable_percpu_irq(aplic_idc_parent_irq,
> > > + irq_get_trigger_type(aplic_idc_parent_irq));
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int aplic_setup_idc(struct aplic_priv *priv)
> > > +{
> > > + int i, j, rc, cpu, setup_count = 0;
> > > + struct fwnode_reference_args parent;
> > > + struct irq_domain *domain;
> > > + unsigned long hartid;
> > > + struct aplic_idc *idc;
> > > + u32 val;
> > > +
> > > + /* Setup per-CPU IDC and target CPU mask */
> > > + for (i = 0; i < priv->nr_idcs; i++) {
> > > + rc = fwnode_property_get_reference_args(priv->fwnode,
> > > + "interrupts-extended", "#interrupt-cells",
> > > + 0, i, &parent);
> > > + if (rc) {
> > > + pr_warn("%pfwP: parent irq for IDC%d not found\n",
> > > + priv->fwnode, i);
> > > + continue;
> > > + }
> > > +
> > > + /*
> > > + * Skip interrupts other than external interrupts for
> > > + * current privilege level.
> > > + */
> > > + if (parent.args[0] != RV_IRQ_EXT)
> > > + continue;
> > > +
> > > + rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> > > + if (rc) {
> > > + pr_warn("%pfwP: invalid hartid for IDC%d\n",
> > > + priv->fwnode, i);
> > > + continue;
> > > + }
> > > +
> > > + cpu = riscv_hartid_to_cpuid(hartid);
> > > + if (cpu < 0) {
> > > + pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> > > + priv->fwnode, i);
> > > + continue;
> > > + }
> > > +
> > > + cpumask_set_cpu(cpu, &priv->lmask);
> > > +
> > > + idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > + idc->hart_index = i;
> > > + idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > > + idc->priv = priv;
> > > +
> > > + aplic_idc_set_delivery(idc, true);
> > > +
> > > + /*
> > > + * Boot cpu might not have APLIC hart_index = 0 so check
> > > + * and update target registers of all interrupts.
> > > + */
> > > + if (cpu == smp_processor_id() && idc->hart_index) {
> > > + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > + val |= APLIC_DEFAULT_PRIORITY;
> > > + for (j = 1; j <= priv->nr_irqs; j++)
> > > + writel(val, priv->regs + APLIC_TARGET_BASE +
> > > + (j - 1) * sizeof(u32));
> > > + }
> > > +
> > > + setup_count++;
> > > + }
> > > +
> > > + /* Find parent domain and register chained handler */
> > > + domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > > + DOMAIN_BUS_ANY);
> > > + if (!aplic_idc_parent_irq && domain) {
> > > + aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > > + if (aplic_idc_parent_irq) {
> > > + irq_set_chained_handler(aplic_idc_parent_irq,
> > > + aplic_idc_handle_irq);
> > > +
> > > + /*
> > > + * Setup CPUHP notifier to enable IDC parent
> > > + * interrupt on all CPUs
> > > + */
> > > + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > > + "irqchip/riscv/aplic:starting",
> > > + aplic_idc_starting_cpu,
> > > + aplic_idc_dying_cpu);
> > > + }
> > > + }
> > > +
> > > + /* Fail if we were not able to setup IDC for any CPU */
> > > + return (setup_count) ? 0 : -ENODEV;
> > > +}
> > > +
> > > +static int aplic_probe(struct platform_device *pdev)
> > > +{
> > > + struct fwnode_handle *fwnode = pdev->dev.fwnode;
> > > + struct fwnode_reference_args parent;
> > > + struct aplic_priv *priv;
> > > + struct resource *res;
> > > + phys_addr_t pa;
> > > + int rc;
> > > +
> > > + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > > + if (!priv)
> > > + return -ENOMEM;
> > > + priv->fwnode = fwnode;
> > > +
> > > + /* Map the MMIO registers */
> > > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > + if (!res) {
> > > + pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> > > + return -EINVAL;
> > > + }
> > > + priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> > > + if (!priv->regs) {
> > > + pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
> > > + return -ENOMEM;
> > > + }
> > > +
> > > + /*
> > > + * Find out GSI base number
> > > + *
> > > + * Note: DT does not define "riscv,gsi-base" property so GSI
> > > + * base is always zero for DT.
> > > + */
> > > + rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> > > + &priv->gsi_base, 1);
> > > + if (rc)
> > > + priv->gsi_base = 0;
> > > +
> > > + /* Find out number of interrupt sources */
> > > + rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> > > + &priv->nr_irqs, 1);
> > > + if (rc) {
> > > + pr_err("%pfwP: failed to get number of interrupt sources\n",
> > > + fwnode);
> > > + return rc;
> > > + }
> > > +
> > > + /* Set up the initial state of APLIC interrupts */
> > > + aplic_init_hw_irqs(priv);
> > > +
> > > + /*
> > > + * Find out number of IDCs based on parent interrupts
> > > + *
> > > + * If "msi-parent" property is present then we ignore the
> > > + * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > > + */
> > > + if (!fwnode_property_present(fwnode, "msi-parent")) {
> > > + while (!fwnode_property_get_reference_args(fwnode,
> > > + "interrupts-extended", "#interrupt-cells",
> > > + 0, priv->nr_idcs, &parent))
> > > + priv->nr_idcs++;
> > > + }
> > > +
> > > + /* Setup IDCs or MSIs based on number of IDCs */
> > > + if (priv->nr_idcs)
> > > + rc = aplic_setup_idc(priv);
> > > + else
> > > + rc = aplic_setup_msi(priv);
> > > + if (rc) {
> > > + pr_err("%pfwP: failed to set up %s\n",
> > > + fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> > > + return rc;
> > > + }
> > > +
> > > + /* Setup global config and interrupt delivery */
> > > + aplic_init_hw_global(priv);
> > > +
> > > + /* Create irq domain instance for the APLIC */
> > > + if (priv->nr_idcs)
> > > + priv->irqdomain = irq_domain_create_linear(
> > > + priv->fwnode,
> > > + priv->nr_irqs + 1,
> > > + &aplic_irqdomain_idc_ops,
> > > + priv);
> > > + else
> > > + priv->irqdomain = platform_msi_create_device_domain(
> > > + &pdev->dev,
> > > + priv->nr_irqs + 1,
> > > + aplic_msi_write_msg,
> > > + &aplic_irqdomain_msi_ops,
> > > + priv);
> > > + if (!priv->irqdomain) {
> > > + pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> > > + return -ENOMEM;
> > > + }
> > > +
> > > + /* Advertise the interrupt controller */
> > > + if (priv->nr_idcs) {
> > > + pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> > > + priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> > > + } else {
> > > + pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > > + pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
> > > + priv->fwnode, priv->nr_irqs, &pa);
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static const struct of_device_id aplic_match[] = {
> > > + { .compatible = "riscv,aplic" },
> > > + {}
> > > +};
> > > +
> > > +static struct platform_driver aplic_driver = {
> > > + .driver = {
> > > + .name = "riscv-aplic",
> > > + .of_match_table = aplic_match,
> > > + },
> > > + .probe = aplic_probe,
> > > +};
> > > +builtin_platform_driver(aplic_driver);
> > > +
> > > +static int __init aplic_dt_init(struct device_node *node,
> > > + struct device_node *parent)
> > > +{
> > > + /*
> > > + * The APLIC platform driver needs to be probed early
> > > + * so for device tree:
> > > + *
> > > + * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > + * provides a hint to the device driver core to probe the
> > > + * platform driver early.
> > > + * 2) Clear the OF_POPULATED flag in device_node because
> > > + * of_irq_init() sets it which prevents creation of
> > > + * platform device.
> > > + */
> > > + node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> >
> > NACK. You are blindly plastering flags without trying to understand
> > the real issue and fixing this correctly.
> >
> > > + of_node_clear_flag(node, OF_POPULATED);
> > > + return 0;
> > > +}
> > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> >
> > This macro pretty much skips the entire driver core framework to probe
> > and calls init and you are supposed to initialize the device when the
> > init function is called.
> >
> > If you want your device/driver to follow the proper platform driver
> > path (which is recommended), then you need to use the
> > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> >
> > I offered to help you debug this issue and I asked for a dts file that
> > corresponds to a board you are testing this on and seeing an issue.
> > But you haven't answered my question [1] and are pointing to some
> > random commit and blaming it. That commit has no impact on any
> > existing devices/drivers.
> >
> > Hi Marc,
> >
> > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > is used or until Anup actually works with us to debug the real issue.
>
> Maybe I misread your previous comment.
>
> You can easily reproduce the issue on QEMU virt machine for RISC-V:
> 1) Build qemu-system-riscv64 from latest QEMU master
> 2) Build kernel from riscv_aia_v4 branch at https://github.com/avpatel/linux.git
> (Note: make sure you remove the FWNODE_FLAG_BEST_EFFORT flag from
> APLIC driver at the time of building kernel)
> 3) Boot an APLIC-only system on the QEMU virt machine
> qemu-system-riscv64 -smp 4 -M virt,aia=aplic -m 1G -nographic \
> -bios opensbi/build/platform/generic/firmware/fw_dynamic.bin \
> -kernel ./build-riscv64/arch/riscv/boot/Image \
> -append "root=/dev/ram rw console=ttyS0 earlycon" \
> -initrd ./rootfs_riscv64.img

Unfortunately, I don't have the time to do all that, but I usually
don't need to run anything to figure out the issue. It's generally
fairly obvious once I look at the DT. I'll also lean on you for some
debug logs.

Where is the dts file that corresponds to this QEMU run? This is the
third time I'm asking for a pointer to a dts file that has this issue,
can you point me to it please? I shouldn't have to say this but: put
it somewhere and point me to it please. Please don't point me to some
git repo and ask me to dig around.

Can you give me details on which supplier is causing the deferred probe
that's a problem for you? Are there any other details you can provide
that would help debug this issue?

> I hope the above steps help you reproduce the issue. I will certainly
> test whatever fix you propose.

Do you plan to try the fix I suggested already? The one about using
the correct macros?

-Saravana
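
For readers following the thread, the macros referred to above are
IRQCHIP_PLATFORM_DRIVER_BEGIN(), IRQCHIP_MATCH() and
IRQCHIP_PLATFORM_DRIVER_END() from <linux/irqchip.h>. The fragment below is
only a minimal sketch of how they are typically used, not the fix that was
eventually applied. The callback name aplic_of_init and the driver name
riscv_aplic are hypothetical, and the body is left as a stub. It is a kernel
source fragment, so it builds only inside a kernel tree.

```c
// SPDX-License-Identifier: GPL-2.0
/* Illustrative sketch only; not part of the posted patch. */
#include <linux/irqchip.h>
#include <linux/of.h>
#include <linux/platform_device.h>

/*
 * Hypothetical init callback. With the macros below it is called from
 * platform_irqchip_probe(), i.e. after the driver core has created and
 * bound the platform device, instead of from of_irq_init().
 */
static int aplic_of_init(struct device_node *node, struct device_node *parent)
{
	/* MMIO mapping, IDC or MSI setup and irqdomain creation would go here. */
	return 0;
}

/* Expands to an of_device_id table plus a built-in platform driver. */
IRQCHIP_PLATFORM_DRIVER_BEGIN(riscv_aplic)
IRQCHIP_MATCH("riscv,aplic", aplic_of_init)
IRQCHIP_PLATFORM_DRIVER_END(riscv_aplic)
```

With this form there is no IRQCHIP_DECLARE() entry, so of_irq_init() does not
claim the node and no fwnode or OF_POPULATED flag changes are needed. One
caveat: the callback receives only device_node pointers, so the parts of
aplic_probe() that use the struct platform_device would need to obtain it
another way; the thread up to this point does not settle how.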

2023-06-19 06:25:33

by Anup Patel

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

On Sat, Jun 17, 2023 at 3:36 AM Saravana Kannan <[email protected]> wrote:
>
> On Thu, Jun 15, 2023 at 7:01 PM Anup Patel <[email protected]> wrote:
> >
> > On Fri, Jun 16, 2023 at 12:47 AM Saravana Kannan <[email protected]> wrote:
> > >
> > > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <[email protected]> wrote:
> > > >
> > > > The RISC-V advanced interrupt architecture (AIA) specification defines
> > > > a new interrupt controller for managing wired interrupts on a RISC-V
> > > > platform. This new interrupt controller is referred to as advanced
> > > > platform-level interrupt controller (APLIC) which can forward wired
> > > > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > > > signaled interrupts.
> > > > > (For more details, refer to https://github.com/riscv/riscv-aia)
> > > >
> > > > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > > > platforms.
> > > >
> > > > Signed-off-by: Anup Patel <[email protected]>
> > > > ---
> > > > drivers/irqchip/Kconfig | 6 +
> > > > drivers/irqchip/Makefile | 1 +
> > > > drivers/irqchip/irq-riscv-aplic.c | 765 ++++++++++++++++++++++++++++
> > > > include/linux/irqchip/riscv-aplic.h | 119 +++++
> > > > 4 files changed, 891 insertions(+)
> > > > create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> > > > create mode 100644 include/linux/irqchip/riscv-aplic.h
> > > >
> > > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > > index d700980372ef..834c0329f583 100644
> > > > --- a/drivers/irqchip/Kconfig
> > > > +++ b/drivers/irqchip/Kconfig
> > > > @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> > > > select IRQ_DOMAIN_HIERARCHY
> > > > select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> > > >
> > > > +config RISCV_APLIC
> > > > + bool
> > > > + depends on RISCV
> > > > + select IRQ_DOMAIN_HIERARCHY
> > > > + select GENERIC_MSI_IRQ
> > > > +
> > > > config RISCV_IMSIC
> > > > bool
> > > > depends on RISCV
> > > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > > > index 577bde3e986b..438b8e1a152c 100644
> > > > --- a/drivers/irqchip/Makefile
> > > > +++ b/drivers/irqchip/Makefile
> > > > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> > > > obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> > > > obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> > > > obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> > > > +obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic.o
> > > > obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
> > > > obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> > > > obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> > > > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > > > new file mode 100644
> > > > index 000000000000..1e710fdf5608
> > > > --- /dev/null
> > > > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > > > @@ -0,0 +1,765 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/*
> > > > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > > > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > > > + */
> > > > +
> > > > +#define pr_fmt(fmt) "riscv-aplic: " fmt
> > > > +#include <linux/bitops.h>
> > > > +#include <linux/cpu.h>
> > > > +#include <linux/interrupt.h>
> > > > +#include <linux/io.h>
> > > > +#include <linux/irq.h>
> > > > +#include <linux/irqchip.h>
> > > > +#include <linux/irqchip/chained_irq.h>
> > > > +#include <linux/irqchip/riscv-aplic.h>
> > > > +#include <linux/irqchip/riscv-imsic.h>
> > > > +#include <linux/irqdomain.h>
> > > > +#include <linux/module.h>
> > > > +#include <linux/msi.h>
> > > > +#include <linux/platform_device.h>
> > > > +#include <linux/smp.h>
> > > > +
> > > > +#define APLIC_DEFAULT_PRIORITY 1
> > > > +#define APLIC_DISABLE_IDELIVERY 0
> > > > +#define APLIC_ENABLE_IDELIVERY 1
> > > > +#define APLIC_DISABLE_ITHRESHOLD 1
> > > > +#define APLIC_ENABLE_ITHRESHOLD 0
> > > > +
> > > > +struct aplic_msicfg {
> > > > + phys_addr_t base_ppn;
> > > > + u32 hhxs;
> > > > + u32 hhxw;
> > > > + u32 lhxs;
> > > > + u32 lhxw;
> > > > +};
> > > > +
> > > > +struct aplic_idc {
> > > > + unsigned int hart_index;
> > > > + void __iomem *regs;
> > > > + struct aplic_priv *priv;
> > > > +};
> > > > +
> > > > +struct aplic_priv {
> > > > + struct fwnode_handle *fwnode;
> > > > + u32 gsi_base;
> > > > + u32 nr_irqs;
> > > > + u32 nr_idcs;
> > > > + void __iomem *regs;
> > > > + struct irq_domain *irqdomain;
> > > > + struct aplic_msicfg msicfg;
> > > > + struct cpumask lmask;
> > > > +};
> > > > +
> > > > +static unsigned int aplic_idc_parent_irq;
> > > > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > > > +
> > > > +static void aplic_irq_unmask(struct irq_data *d)
> > > > +{
> > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > +
> > > > + writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > > > +
> > > > + if (!priv->nr_idcs)
> > > > + irq_chip_unmask_parent(d);
> > > > +}
> > > > +
> > > > +static void aplic_irq_mask(struct irq_data *d)
> > > > +{
> > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > +
> > > > + writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > > > +
> > > > + if (!priv->nr_idcs)
> > > > + irq_chip_mask_parent(d);
> > > > +}
> > > > +
> > > > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > > > +{
> > > > + u32 val = 0;
> > > > + void __iomem *sourcecfg;
> > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > +
> > > > + switch (type) {
> > > > + case IRQ_TYPE_NONE:
> > > > + val = APLIC_SOURCECFG_SM_INACTIVE;
> > > > + break;
> > > > + case IRQ_TYPE_LEVEL_LOW:
> > > > + val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > > > + break;
> > > > + case IRQ_TYPE_LEVEL_HIGH:
> > > > + val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > > > + break;
> > > > + case IRQ_TYPE_EDGE_FALLING:
> > > > + val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > > > + break;
> > > > + case IRQ_TYPE_EDGE_RISING:
> > > > + val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > > > + break;
> > > > + default:
> > > > + return -EINVAL;
> > > > + }
> > > > +
> > > > + sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > > > + sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > > > + writel(val, sourcecfg);
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +static void aplic_irq_eoi(struct irq_data *d)
> > > > +{
> > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > + u32 reg_off, reg_mask;
> > > > +
> > > > + /*
> > > > > + * EOI handling is required only for level-triggered
> > > > > + * interrupts in APLIC MSI mode.
> > > > + */
> > > > +
> > > > + if (priv->nr_idcs)
> > > > + return;
> > > > +
> > > > + reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> > > > + reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> > > > + switch (irqd_get_trigger_type(d)) {
> > > > + case IRQ_TYPE_LEVEL_LOW:
> > > > + if (!(readl(priv->regs + reg_off) & reg_mask))
> > > > + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > + break;
> > > > + case IRQ_TYPE_LEVEL_HIGH:
> > > > + if (readl(priv->regs + reg_off) & reg_mask)
> > > > + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > + break;
> > > > + }
> > > > +}
> > > > +
> > > > +#ifdef CONFIG_SMP
> > > > +static int aplic_set_affinity(struct irq_data *d,
> > > > + const struct cpumask *mask_val, bool force)
> > > > +{
> > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > + struct aplic_idc *idc;
> > > > + unsigned int cpu, val;
> > > > + struct cpumask amask;
> > > > + void __iomem *target;
> > > > +
> > > > + if (!priv->nr_idcs)
> > > > + return irq_chip_set_affinity_parent(d, mask_val, force);
> > > > +
> > > > + cpumask_and(&amask, &priv->lmask, mask_val);
> > > > +
> > > > + if (force)
> > > > + cpu = cpumask_first(&amask);
> > > > + else
> > > > + cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > +
> > > > + if (cpu >= nr_cpu_ids)
> > > > + return -EINVAL;
> > > > +
> > > > + idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > + target = priv->regs + APLIC_TARGET_BASE;
> > > > + target += (d->hwirq - 1) * sizeof(u32);
> > > > + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > + val |= APLIC_DEFAULT_PRIORITY;
> > > > + writel(val, target);
> > > > +
> > > > + irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > +
> > > > + return IRQ_SET_MASK_OK_DONE;
> > > > +}
> > > > +#endif
> > > > +
> > > > +static struct irq_chip aplic_chip = {
> > > > + .name = "RISC-V APLIC",
> > > > + .irq_mask = aplic_irq_mask,
> > > > + .irq_unmask = aplic_irq_unmask,
> > > > + .irq_set_type = aplic_set_type,
> > > > + .irq_eoi = aplic_irq_eoi,
> > > > +#ifdef CONFIG_SMP
> > > > + .irq_set_affinity = aplic_set_affinity,
> > > > +#endif
> > > > + .flags = IRQCHIP_SET_TYPE_MASKED |
> > > > + IRQCHIP_SKIP_SET_WAKE |
> > > > + IRQCHIP_MASK_ON_SUSPEND,
> > > > +};
> > > > +
> > > > +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> > > > + u32 gsi_base,
> > > > + unsigned long *hwirq,
> > > > + unsigned int *type)
> > > > +{
> > > > + if (WARN_ON(fwspec->param_count < 2))
> > > > + return -EINVAL;
> > > > + if (WARN_ON(!fwspec->param[0]))
> > > > + return -EINVAL;
> > > > +
> > > > + /* For DT, gsi_base is always zero. */
> > > > + *hwirq = fwspec->param[0] - gsi_base;
> > > > + *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > > > +
> > > > + WARN_ON(*type == IRQ_TYPE_NONE);
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> > > > + struct irq_fwspec *fwspec,
> > > > + unsigned long *hwirq,
> > > > + unsigned int *type)
> > > > +{
> > > > + struct aplic_priv *priv = platform_msi_get_host_data(d);
> > > > +
> > > > + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > +}
> > > > +
> > > > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > > > + unsigned int virq, unsigned int nr_irqs,
> > > > + void *arg)
> > > > +{
> > > > + int i, ret;
> > > > + unsigned int type;
> > > > + irq_hw_number_t hwirq;
> > > > + struct irq_fwspec *fwspec = arg;
> > > > + struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > > > +
> > > > + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > + if (ret)
> > > > + return ret;
> > > > +
> > > > + ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > > > + if (ret)
> > > > + return ret;
> > > > +
> > > > + for (i = 0; i < nr_irqs; i++) {
> > > > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > + &aplic_chip, priv, handle_fasteoi_irq,
> > > > + NULL, NULL);
> > > > + /*
> > > > + * APLIC does not implement irq_disable() so Linux interrupt
> > > > + * subsystem will take a lazy approach for disabling an APLIC
> > > > + * interrupt. This means APLIC interrupts are left unmasked
> > > > + * upon system suspend and interrupts are not processed
> > > > + * immediately upon system wake up. To tackle this, we disable
> > > > + * the lazy approach for all APLIC interrupts.
> > > > + */
> > > > + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > + }
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > > > + .translate = aplic_irqdomain_msi_translate,
> > > > + .alloc = aplic_irqdomain_msi_alloc,
> > > > + .free = platform_msi_device_domain_free,
> > > > +};
> > > > +
> > > > +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> > > > + struct irq_fwspec *fwspec,
> > > > + unsigned long *hwirq,
> > > > + unsigned int *type)
> > > > +{
> > > > + struct aplic_priv *priv = d->host_data;
> > > > +
> > > > + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > +}
> > > > +
> > > > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > > > + unsigned int virq, unsigned int nr_irqs,
> > > > + void *arg)
> > > > +{
> > > > + int i, ret;
> > > > + unsigned int type;
> > > > + irq_hw_number_t hwirq;
> > > > + struct irq_fwspec *fwspec = arg;
> > > > + struct aplic_priv *priv = domain->host_data;
> > > > +
> > > > + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > + if (ret)
> > > > + return ret;
> > > > +
> > > > + for (i = 0; i < nr_irqs; i++) {
> > > > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > + &aplic_chip, priv, handle_fasteoi_irq,
> > > > + NULL, NULL);
> > > > + irq_set_affinity(virq + i, &priv->lmask);
> > > > + /* See the reason described in aplic_irqdomain_msi_alloc() */
> > > > + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > + }
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > > > + .translate = aplic_irqdomain_idc_translate,
> > > > + .alloc = aplic_irqdomain_idc_alloc,
> > > > + .free = irq_domain_free_irqs_top,
> > > > +};
> > > > +
> > > > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > > > +{
> > > > + int i;
> > > > +
> > > > + /* Disable all interrupts */
> > > > + for (i = 0; i <= priv->nr_irqs; i += 32)
> > > > + writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > > > + (i / 32) * sizeof(u32));
> > > > +
> > > > + /* Set interrupt type and default priority for all interrupts */
> > > > + for (i = 1; i <= priv->nr_irqs; i++) {
> > > > + writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > > > + (i - 1) * sizeof(u32));
> > > > + writel(APLIC_DEFAULT_PRIORITY,
> > > > + priv->regs + APLIC_TARGET_BASE +
> > > > + (i - 1) * sizeof(u32));
> > > > + }
> > > > +
> > > > + /* Clear APLIC domaincfg */
> > > > + writel(0, priv->regs + APLIC_DOMAINCFG);
> > > > +}
> > > > +
> > > > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > > > +{
> > > > + u32 val;
> > > > +#ifdef CONFIG_RISCV_M_MODE
> > > > + u32 valH;
> > > > +
> > > > + if (!priv->nr_idcs) {
> > > > + val = priv->msicfg.base_ppn;
> > > > + valH = (priv->msicfg.base_ppn >> 32) &
> > > > + APLIC_xMSICFGADDRH_BAPPN_MASK;
> > > > + valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > > > + << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > > > + valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > > > + << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > > > + valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > > > + << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > > > + valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > > > + << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > > > + writel(val, priv->regs + APLIC_xMSICFGADDR);
> > > > + writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > > > + }
> > > > +#endif
> > > > +
> > > > + /* Setup APLIC domaincfg register */
> > > > + val = readl(priv->regs + APLIC_DOMAINCFG);
> > > > + val |= APLIC_DOMAINCFG_IE;
> > > > + if (!priv->nr_idcs)
> > > > + val |= APLIC_DOMAINCFG_DM;
> > > > + writel(val, priv->regs + APLIC_DOMAINCFG);
> > > > + if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > > > + pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> > > > + priv->fwnode, val);
> > > > +}
> > > > +
> > > > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > > > +{
> > > > + unsigned int group_index, hart_index, guest_index, val;
> > > > + struct irq_data *d = irq_get_irq_data(desc->irq);
> > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > + struct aplic_msicfg *mc = &priv->msicfg;
> > > > + phys_addr_t tppn, tbppn, msg_addr;
> > > > + void __iomem *target;
> > > > +
> > > > + /* For zeroed MSI, simply write zero into the target register */
> > > > + if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > > > + target = priv->regs + APLIC_TARGET_BASE;
> > > > + target += (d->hwirq - 1) * sizeof(u32);
> > > > + writel(0, target);
> > > > + return;
> > > > + }
> > > > +
> > > > + /* Sanity check on message data */
> > > > + WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > > > +
> > > > + /* Compute target MSI address */
> > > > + msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > > > + tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > +
> > > > + /* Compute target HART Base PPN */
> > > > + tbppn = tppn;
> > > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > + WARN_ON(tbppn != mc->base_ppn);
> > > > +
> > > > + /* Compute target group and hart indexes */
> > > > + group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > > > + APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > > > + hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > > > + APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > > > + hart_index |= (group_index << mc->lhxw);
> > > > + WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > > > +
> > > > + /* Compute target guest index */
> > > > + guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > + WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > > > +
> > > > + /* Update IRQ TARGET register */
> > > > + target = priv->regs + APLIC_TARGET_BASE;
> > > > + target += (d->hwirq - 1) * sizeof(u32);
> > > > + val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > > > + << APLIC_TARGET_HART_IDX_SHIFT;
> > > > + val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > > > + << APLIC_TARGET_GUEST_IDX_SHIFT;
> > > > + val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > > > + writel(val, target);
> > > > +}
> > > > +
> > > > +static int aplic_setup_msi(struct aplic_priv *priv)
> > > > +{
> > > > + struct aplic_msicfg *mc = &priv->msicfg;
> > > > + const struct imsic_global_config *imsic_global;
> > > > +
> > > > + /*
> > > > + * The APLIC outgoing MSI config registers assume target MSI
> > > > + * controller to be RISC-V AIA IMSIC controller.
> > > > + */
> > > > + imsic_global = imsic_get_global_config();
> > > > + if (!imsic_global) {
> > > > + pr_err("%pfwP: IMSIC global config not found\n",
> > > > + priv->fwnode);
> > > > + return -ENODEV;
> > > > + }
> > > > +
> > > > + /* Find number of guest index bits (LHXS) */
> > > > + mc->lhxs = imsic_global->guest_index_bits;
> > > > + if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > > > + pr_err("%pfwP: IMSIC guest index bits big for APLIC LHXS\n",
> > > > + priv->fwnode);
> > > > + return -EINVAL;
> > > > + }
> > > > +
> > > > + /* Find number of HART index bits (LHXW) */
> > > > + mc->lhxw = imsic_global->hart_index_bits;
> > > > + if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > > > + pr_err("%pfwP: IMSIC hart index bits big for APLIC LHXW\n",
> > > > + priv->fwnode);
> > > > + return -EINVAL;
> > > > + }
> > > > +
> > > > + /* Find number of group index bits (HHXW) */
> > > > + mc->hhxw = imsic_global->group_index_bits;
> > > > + if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > > > + pr_err("%pfwP: IMSIC group index bits big for APLIC HHXW\n",
> > > > + priv->fwnode);
> > > > + return -EINVAL;
> > > > + }
> > > > +
> > > > + /* Find first bit position of group index (HHXS) */
> > > > + mc->hhxs = imsic_global->group_index_shift;
> > > > + if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > > > + pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> > > > + priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > > > + return -EINVAL;
> > > > + }
> > > > + mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > > > + if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > > > + pr_err("%pfwP: IMSIC group index shift big for APLIC HHXS\n",
> > > > + priv->fwnode);
> > > > + return -EINVAL;
> > > > + }
> > > > +
> > > > + /* Compute PPN base */
> > > > + mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > +
> > > > + /* Use all possible CPUs as lmask */
> > > > + cpumask_copy(&priv->lmask, cpu_possible_mask);
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +/*
> > > > + * To handle APLIC IDC interrupts, we just read the CLAIMI register,
> > > > + * which returns the highest-priority pending interrupt and clears the
> > > > + * pending bit of that interrupt. This process is repeated until the
> > > > + * CLAIMI register returns zero.
> > > > + */
> > > > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > > > +{
> > > > + struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > > > + struct irq_chip *chip = irq_desc_get_chip(desc);
> > > > + irq_hw_number_t hw_irq;
> > > > + int irq;
> > > > +
> > > > + chained_irq_enter(chip, desc);
> > > > +
> > > > + while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > > > + hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > > > + irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > > > +
> > > > + if (unlikely(irq <= 0))
> > > > + pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > > > + hw_irq);
> > > > + else
> > > > + generic_handle_irq(irq);
> > > > + }
> > > > +
> > > > + chained_irq_exit(chip, desc);
> > > > +}
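
The claim-and-drain loop quoted above can be modeled in user space roughly as follows. This is only a minimal sketch, not the driver code: model_idc, model_read_claimi(), model_handle_irq() and MODEL_TOPI_ID_SHIFT are hypothetical names, and the value layout (interrupt identity above bit 16, priority in the low byte) is an assumption meant to mirror APLIC_IDC_TOPI_ID_SHIFT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_TOPI_ID_SHIFT 16 /* assumption: identity field starts at bit 16 */
#define MODEL_MAX_IRQS 64      /* hypothetical size of the modeled source table */

struct model_idc {
	uint8_t pending[MODEL_MAX_IRQS]; /* non-zero means source i is pending */
	uint8_t prio[MODEL_MAX_IRQS];    /* lower number means higher priority */
};

/*
 * Model of one CLAIMI read: pick the highest-priority pending source
 * (lowest source number wins a tie), clear its pending bit and return
 * identity and priority packed in one word. Zero means nothing is pending.
 */
static uint32_t model_read_claimi(struct model_idc *idc)
{
	uint32_t best = 0;

	for (uint32_t i = 1; i < MODEL_MAX_IRQS; i++) {
		if (!idc->pending[i])
			continue;
		if (!best || idc->prio[i] < idc->prio[best])
			best = i;
	}
	if (!best)
		return 0;

	idc->pending[best] = 0;
	return (best << MODEL_TOPI_ID_SHIFT) | idc->prio[best];
}

/* Drain loop: keep claiming until the modeled register reads zero. */
static size_t model_handle_irq(struct model_idc *idc, uint32_t *out, size_t max)
{
	size_t n = 0;
	uint32_t v;

	assert(out != NULL);
	while (n < max && (v = model_read_claimi(idc)))
		out[n++] = v >> MODEL_TOPI_ID_SHIFT; /* recover the source number */

	return n;
}
```

With sources 2, 5 and 9 pending, and source 9 given the numerically lowest priority value, the model hands them out in the order 9, 2, 5 and then reads zero, which is what ends the loop.
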
> > > > +
> > > > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > > > +{
> > > > + u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > > > + u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > > > +
> > > > + /* Priority must be less than threshold for interrupt triggering */
> > > > + writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > > > +
> > > > + /* Delivery must be set to 1 for interrupt triggering */
> > > > + writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > > > +}
> > > > +
> > > > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > > > +{
> > > > + if (aplic_idc_parent_irq)
> > > > + disable_percpu_irq(aplic_idc_parent_irq);
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > > > +{
> > > > + if (aplic_idc_parent_irq)
> > > > + enable_percpu_irq(aplic_idc_parent_irq,
> > > > + irq_get_trigger_type(aplic_idc_parent_irq));
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +static int aplic_setup_idc(struct aplic_priv *priv)
> > > > +{
> > > > + int i, j, rc, cpu, setup_count = 0;
> > > > + struct fwnode_reference_args parent;
> > > > + struct irq_domain *domain;
> > > > + unsigned long hartid;
> > > > + struct aplic_idc *idc;
> > > > + u32 val;
> > > > +
> > > > + /* Setup per-CPU IDC and target CPU mask */
> > > > + for (i = 0; i < priv->nr_idcs; i++) {
> > > > + rc = fwnode_property_get_reference_args(priv->fwnode,
> > > > + "interrupts-extended", "#interrupt-cells",
> > > > + 0, i, &parent);
> > > > + if (rc) {
> > > > + pr_warn("%pfwP: parent irq for IDC%d not found\n",
> > > > + priv->fwnode, i);
> > > > + continue;
> > > > + }
> > > > +
> > > > + /*
> > > > + * Skip interrupts other than external interrupts for
> > > > + * current privilege level.
> > > > + */
> > > > + if (parent.args[0] != RV_IRQ_EXT)
> > > > + continue;
> > > > +
> > > > + rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> > > > + if (rc) {
> > > > + pr_warn("%pfwP: invalid hartid for IDC%d\n",
> > > > + priv->fwnode, i);
> > > > + continue;
> > > > + }
> > > > +
> > > > + cpu = riscv_hartid_to_cpuid(hartid);
> > > > + if (cpu < 0) {
> > > > + pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> > > > + priv->fwnode, i);
> > > > + continue;
> > > > + }
> > > > +
> > > > + cpumask_set_cpu(cpu, &priv->lmask);
> > > > +
> > > > + idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > + idc->hart_index = i;
> > > > + idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > > > + idc->priv = priv;
> > > > +
> > > > + aplic_idc_set_delivery(idc, true);
> > > > +
> > > > + /*
> > > > + * Boot cpu might not have APLIC hart_index = 0 so check
> > > > + * and update target registers of all interrupts.
> > > > + */
> > > > + if (cpu == smp_processor_id() && idc->hart_index) {
> > > > + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > + val |= APLIC_DEFAULT_PRIORITY;
> > > > + for (j = 1; j <= priv->nr_irqs; j++)
> > > > + writel(val, priv->regs + APLIC_TARGET_BASE +
> > > > + (j - 1) * sizeof(u32));
> > > > + }
> > > > +
> > > > + setup_count++;
> > > > + }
> > > > +
> > > > + /* Find parent domain and register chained handler */
> > > > + domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > > > + DOMAIN_BUS_ANY);
> > > > + if (!aplic_idc_parent_irq && domain) {
> > > > + aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > > > + if (aplic_idc_parent_irq) {
> > > > + irq_set_chained_handler(aplic_idc_parent_irq,
> > > > + aplic_idc_handle_irq);
> > > > +
> > > > + /*
> > > > + * Setup CPUHP notifier to enable IDC parent
> > > > + * interrupt on all CPUs
> > > > + */
> > > > + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > > > + "irqchip/riscv/aplic:starting",
> > > > + aplic_idc_starting_cpu,
> > > > + aplic_idc_dying_cpu);
> > > > + }
> > > > + }
> > > > +
> > > > + /* Fail if we were not able to setup IDC for any CPU */
> > > > + return (setup_count) ? 0 : -ENODEV;
> > > > +}
> > > > +
> > > > +static int aplic_probe(struct platform_device *pdev)
> > > > +{
> > > > + struct fwnode_handle *fwnode = pdev->dev.fwnode;
> > > > + struct fwnode_reference_args parent;
> > > > + struct aplic_priv *priv;
> > > > + struct resource *res;
> > > > + phys_addr_t pa;
> > > > + int rc;
> > > > +
> > > > + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > > > + if (!priv)
> > > > + return -ENOMEM;
> > > > + priv->fwnode = fwnode;
> > > > +
> > > > + /* Map the MMIO registers */
> > > > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > > + if (!res) {
> > > > + pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> > > > + return -EINVAL;
> > > > + }
> > > > + priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> > > > + if (!priv->regs) {
> > > > + pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
> > > > + return -ENOMEM;
> > > > + }
> > > > +
> > > > + /*
> > > > + * Find out GSI base number
> > > > + *
> > > > + * Note: DT does not define "riscv,gsi-base" property so GSI
> > > > + * base is always zero for DT.
> > > > + */
> > > > + rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> > > > + &priv->gsi_base, 1);
> > > > + if (rc)
> > > > + priv->gsi_base = 0;
> > > > +
> > > > + /* Find out number of interrupt sources */
> > > > + rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> > > > + &priv->nr_irqs, 1);
> > > > + if (rc) {
> > > > + pr_err("%pfwP: failed to get number of interrupt sources\n",
> > > > + fwnode);
> > > > + return rc;
> > > > + }
> > > > +
> > > > + /* Set up the initial state of APLIC interrupts */
> > > > + aplic_init_hw_irqs(priv);
> > > > +
> > > > + /*
> > > > + * Find out number of IDCs based on parent interrupts
> > > > + *
> > > > + * If "msi-parent" property is present then we ignore the
> > > > + * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > > > + */
> > > > + if (!fwnode_property_present(fwnode, "msi-parent")) {
> > > > + while (!fwnode_property_get_reference_args(fwnode,
> > > > + "interrupts-extended", "#interrupt-cells",
> > > > + 0, priv->nr_idcs, &parent))
> > > > + priv->nr_idcs++;
> > > > + }
> > > > +
> > > > + /* Setup IDCs or MSIs based on number of IDCs */
> > > > + if (priv->nr_idcs)
> > > > + rc = aplic_setup_idc(priv);
> > > > + else
> > > > + rc = aplic_setup_msi(priv);
> > > > + if (rc) {
> > > > + pr_err("%pfwP: failed to set up %s\n",
> > > > + fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> > > > + return rc;
> > > > + }
> > > > +
> > > > + /* Setup global config and interrupt delivery */
> > > > + aplic_init_hw_global(priv);
> > > > +
> > > > + /* Create irq domain instance for the APLIC */
> > > > + if (priv->nr_idcs)
> > > > + priv->irqdomain = irq_domain_create_linear(
> > > > + priv->fwnode,
> > > > + priv->nr_irqs + 1,
> > > > + &aplic_irqdomain_idc_ops,
> > > > + priv);
> > > > + else
> > > > + priv->irqdomain = platform_msi_create_device_domain(
> > > > + &pdev->dev,
> > > > + priv->nr_irqs + 1,
> > > > + aplic_msi_write_msg,
> > > > + &aplic_irqdomain_msi_ops,
> > > > + priv);
> > > > + if (!priv->irqdomain) {
> > > > + pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> > > > + return -ENOMEM;
> > > > + }
> > > > +
> > > > + /* Advertise the interrupt controller */
> > > > + if (priv->nr_idcs) {
> > > > + pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> > > > + priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> > > > + } else {
> > > > + pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > + pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
> > > > + priv->fwnode, priv->nr_irqs, &pa);
> > > > + }
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +static const struct of_device_id aplic_match[] = {
> > > > + { .compatible = "riscv,aplic" },
> > > > + {}
> > > > +};
> > > > +
> > > > +static struct platform_driver aplic_driver = {
> > > > + .driver = {
> > > > + .name = "riscv-aplic",
> > > > + .of_match_table = aplic_match,
> > > > + },
> > > > + .probe = aplic_probe,
> > > > +};
> > > > +builtin_platform_driver(aplic_driver);
> > > > +
> > > > +static int __init aplic_dt_init(struct device_node *node,
> > > > + struct device_node *parent)
> > > > +{
> > > > + /*
> > > > + * The APLIC platform driver needs to be probed early
> > > > + * so for device tree:
> > > > + *
> > > > + * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > > + * provides a hint to the device driver core to probe the
> > > > + * platform driver early.
> > > > + * 2) Clear the OF_POPULATED flag in device_node because
> > > > + * of_irq_init() sets it which prevents creation of
> > > > + * platform device.
> > > > + */
> > > > + node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> > >
> > > NACK. You are blindly plastering flags without trying to understand
> > > the real issue and fixing this correctly.
> > >
> > > > + of_node_clear_flag(node, OF_POPULATED);
> > > > + return 0;
> > > > +}
> > > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> > >
> > > This macro pretty much skips the entire driver core framework to probe
> > > and calls init and you are supposed to initialize the device when the
> > > init function is called.
> > >
> > > If you want your device/driver to follow the proper platform driver
> > > path (which is recommended), then you need to use the
> > > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> > >
> > > I offered to help you debug this issue and I asked for a dts file that
> > > corresponds to a board you are testing this on and seeing an issue.
> > > But you haven't answered my question [1] and are pointing to some
> > > random commit and blaming it. That commit has no impact on any
> > > existing devices/drivers.
> > >
> > > Hi Marc,
> > >
> > > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > > is used or until Anup actually works with us to debug the real issue.
> >
> > Maybe I misread your previous comment.
> >
> > You can easily reproduce the issue on QEMU virt machine for RISC-V:
> > 1) Build qemu-system-riscv64 from latest QEMU master
> > 2) Build kernel from riscv_aia_v4 branch at https://github.com/avpatel/linux.git
> > (Note: make sure you remove the FWNODE_FLAG_BEST_EFFORT flag from
> > APLIC driver at the time of building kernel)
> > 3) Boot an APLIC-only system on QEMU virt machine
> > qemu-system-riscv64 -smp 4 -M virt,aia=aplic -m 1G -nographic \
> > -bios opensbi/build/platform/generic/firmware/fw_dynamic.bin \
> > -kernel ./build-riscv64/arch/riscv/boot/Image \
> > -append "root=/dev/ram rw console=ttyS0 earlycon" \
> > -initrd ./rootfs_riscv64.img
>
> Unfortunately, I don't have the time to do all that, but I generally
> don't need to run something to figure out the issue. It's generally
> fairly obvious once I look at the DT. I'll also lean on you for some
> debug logs.

The boot log with the FWNODE_FLAG_BEST_EFFORT flag set in the APLIC
driver can be found at:
https://drive.google.com/file/d/1C-uuHbh6Zk9xkAsfGLfhb_4WighvmQp1/view?usp=sharing

The boot log without the FWNODE_FLAG_BEST_EFFORT flag in the APLIC
driver can be found at:
https://drive.google.com/file/d/12SRdR-2Frv_5O06kbuI_LUJ88khjf_7O/view?usp=sharing

>
> Where is the dts file that corresponds to this QEMU run? This is the
> third time I'm asking for a pointer to a dts file that has this issue,
> can you point me to it please? I shouldn't have to say this but: put
> it somewhere and point me to it please. Please don't point me to some
> git repo and ask me to dig around.

For the QEMU virt machine, the DTB is generated at runtime as part of
virt machine creation. The DTS dumped by QEMU using the "dumpdtb"
command-line option can be found at:
https://drive.google.com/file/d/1EU-exItL1B7EWuoXw4q-Ypocq--5Wvn8/view?usp=sharing

>
> Can you give me details on what supplier is causing the deferred probe
> that's a problem for you? Any other details you can provide that'll
> help debug this issue?

FWNODE supplier for APLIC DT node is the OF framework.

>
> > I hope the above steps help you reproduce the issue. I will certainly
> > test whatever fix you propose.
>
> Do you plan to try the fix I suggested already? The one about using
> the correct macros?

Do you mean using IRQCHIP_DECLARE() in the APLIC driver,
or something else?

Regards,
Anup

2023-06-22 21:07:15

by Saravana Kannan

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

On Sun, Jun 18, 2023 at 11:13 PM Anup Patel <[email protected]> wrote:
>
> On Sat, Jun 17, 2023 at 3:36 AM Saravana Kannan <[email protected]> wrote:
> >
> > On Thu, Jun 15, 2023 at 7:01 PM Anup Patel <[email protected]> wrote:
> > >
> > > On Fri, Jun 16, 2023 at 12:47 AM Saravana Kannan <[email protected]> wrote:
> > > >
> > > > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <[email protected]> wrote:
> > > > >
> > > > > The RISC-V advanced interrupt architecture (AIA) specification defines
> > > > > a new interrupt controller for managing wired interrupts on a RISC-V
> > > > > platform. This new interrupt controller is referred to as advanced
> > > > > platform-level interrupt controller (APLIC) which can forward wired
> > > > > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > > > > signaled interrupts.
> > > > > > (For more details, refer to https://github.com/riscv/riscv-aia)
> > > > >
> > > > > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > > > > platforms.
> > > > >
> > > > > Signed-off-by: Anup Patel <[email protected]>
> > > > > ---
> > > > > drivers/irqchip/Kconfig | 6 +
> > > > > drivers/irqchip/Makefile | 1 +
> > > > > drivers/irqchip/irq-riscv-aplic.c | 765 ++++++++++++++++++++++++++++
> > > > > include/linux/irqchip/riscv-aplic.h | 119 +++++
> > > > > 4 files changed, 891 insertions(+)
> > > > > create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> > > > > create mode 100644 include/linux/irqchip/riscv-aplic.h
> > > > >
> > > > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > > > index d700980372ef..834c0329f583 100644
> > > > > --- a/drivers/irqchip/Kconfig
> > > > > +++ b/drivers/irqchip/Kconfig
> > > > > @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> > > > > select IRQ_DOMAIN_HIERARCHY
> > > > > select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> > > > >
> > > > > +config RISCV_APLIC
> > > > > + bool
> > > > > + depends on RISCV
> > > > > + select IRQ_DOMAIN_HIERARCHY
> > > > > + select GENERIC_MSI_IRQ
> > > > > +
> > > > > config RISCV_IMSIC
> > > > > bool
> > > > > depends on RISCV
> > > > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > > > > index 577bde3e986b..438b8e1a152c 100644
> > > > > --- a/drivers/irqchip/Makefile
> > > > > +++ b/drivers/irqchip/Makefile
> > > > > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> > > > > obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> > > > > obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> > > > > obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> > > > > +obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic.o
> > > > > obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
> > > > > obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> > > > > obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> > > > > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > > > > new file mode 100644
> > > > > index 000000000000..1e710fdf5608
> > > > > --- /dev/null
> > > > > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > > > > @@ -0,0 +1,765 @@
> > > > > +// SPDX-License-Identifier: GPL-2.0
> > > > > +/*
> > > > > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > > > > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > > > > + */
> > > > > +
> > > > > +#define pr_fmt(fmt) "riscv-aplic: " fmt
> > > > > +#include <linux/bitops.h>
> > > > > +#include <linux/cpu.h>
> > > > > +#include <linux/interrupt.h>
> > > > > +#include <linux/io.h>
> > > > > +#include <linux/irq.h>
> > > > > +#include <linux/irqchip.h>
> > > > > +#include <linux/irqchip/chained_irq.h>
> > > > > +#include <linux/irqchip/riscv-aplic.h>
> > > > > +#include <linux/irqchip/riscv-imsic.h>
> > > > > +#include <linux/irqdomain.h>
> > > > > +#include <linux/module.h>
> > > > > +#include <linux/msi.h>
> > > > > +#include <linux/platform_device.h>
> > > > > +#include <linux/smp.h>
> > > > > +
> > > > > +#define APLIC_DEFAULT_PRIORITY 1
> > > > > +#define APLIC_DISABLE_IDELIVERY 0
> > > > > +#define APLIC_ENABLE_IDELIVERY 1
> > > > > +#define APLIC_DISABLE_ITHRESHOLD 1
> > > > > +#define APLIC_ENABLE_ITHRESHOLD 0
> > > > > +
> > > > > +struct aplic_msicfg {
> > > > > + phys_addr_t base_ppn;
> > > > > + u32 hhxs;
> > > > > + u32 hhxw;
> > > > > + u32 lhxs;
> > > > > + u32 lhxw;
> > > > > +};
> > > > > +
> > > > > +struct aplic_idc {
> > > > > + unsigned int hart_index;
> > > > > + void __iomem *regs;
> > > > > + struct aplic_priv *priv;
> > > > > +};
> > > > > +
> > > > > +struct aplic_priv {
> > > > > + struct fwnode_handle *fwnode;
> > > > > + u32 gsi_base;
> > > > > + u32 nr_irqs;
> > > > > + u32 nr_idcs;
> > > > > + void __iomem *regs;
> > > > > + struct irq_domain *irqdomain;
> > > > > + struct aplic_msicfg msicfg;
> > > > > + struct cpumask lmask;
> > > > > +};
> > > > > +
> > > > > +static unsigned int aplic_idc_parent_irq;
> > > > > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > > > > +
> > > > > +static void aplic_irq_unmask(struct irq_data *d)
> > > > > +{
> > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > +
> > > > > + writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > > > > +
> > > > > + if (!priv->nr_idcs)
> > > > > + irq_chip_unmask_parent(d);
> > > > > +}
> > > > > +
> > > > > +static void aplic_irq_mask(struct irq_data *d)
> > > > > +{
> > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > +
> > > > > + writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > > > > +
> > > > > + if (!priv->nr_idcs)
> > > > > + irq_chip_mask_parent(d);
> > > > > +}
> > > > > +
> > > > > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > > > > +{
> > > > > + u32 val = 0;
> > > > > + void __iomem *sourcecfg;
> > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > +
> > > > > + switch (type) {
> > > > > + case IRQ_TYPE_NONE:
> > > > > + val = APLIC_SOURCECFG_SM_INACTIVE;
> > > > > + break;
> > > > > + case IRQ_TYPE_LEVEL_LOW:
> > > > > + val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > > > > + break;
> > > > > + case IRQ_TYPE_LEVEL_HIGH:
> > > > > + val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > > > > + break;
> > > > > + case IRQ_TYPE_EDGE_FALLING:
> > > > > + val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > > > > + break;
> > > > > + case IRQ_TYPE_EDGE_RISING:
> > > > > + val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > > > > + break;
> > > > > + default:
> > > > > + return -EINVAL;
> > > > > + }
> > > > > +
> > > > > + sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > > > > + sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > > > > + writel(val, sourcecfg);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +static void aplic_irq_eoi(struct irq_data *d)
> > > > > +{
> > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > + u32 reg_off, reg_mask;
> > > > > +
> > > > > + /*
> > > > > > + * EOI handling is required only for level-triggered
> > > > > + * interrupts in APLIC MSI mode.
> > > > > + */
> > > > > +
> > > > > + if (priv->nr_idcs)
> > > > > + return;
> > > > > +
> > > > > + reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> > > > > + reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> > > > > + switch (irqd_get_trigger_type(d)) {
> > > > > + case IRQ_TYPE_LEVEL_LOW:
> > > > > + if (!(readl(priv->regs + reg_off) & reg_mask))
> > > > > + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > > + break;
> > > > > + case IRQ_TYPE_LEVEL_HIGH:
> > > > > + if (readl(priv->regs + reg_off) & reg_mask)
> > > > > + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > > + break;
> > > > > + }
> > > > > +}
> > > > > +
> > > > > +#ifdef CONFIG_SMP
> > > > > +static int aplic_set_affinity(struct irq_data *d,
> > > > > + const struct cpumask *mask_val, bool force)
> > > > > +{
> > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > + struct aplic_idc *idc;
> > > > > + unsigned int cpu, val;
> > > > > + struct cpumask amask;
> > > > > + void __iomem *target;
> > > > > +
> > > > > + if (!priv->nr_idcs)
> > > > > + return irq_chip_set_affinity_parent(d, mask_val, force);
> > > > > +
> > > > > + cpumask_and(&amask, &priv->lmask, mask_val);
> > > > > +
> > > > > + if (force)
> > > > > + cpu = cpumask_first(&amask);
> > > > > + else
> > > > > + cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > > +
> > > > > + if (cpu >= nr_cpu_ids)
> > > > > + return -EINVAL;
> > > > > +
> > > > > + idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > > + target = priv->regs + APLIC_TARGET_BASE;
> > > > > + target += (d->hwirq - 1) * sizeof(u32);
> > > > > + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > > + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > > + val |= APLIC_DEFAULT_PRIORITY;
> > > > > + writel(val, target);
> > > > > +
> > > > > + irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > > +
> > > > > + return IRQ_SET_MASK_OK_DONE;
> > > > > +}
> > > > > +#endif
> > > > > +
> > > > > +static struct irq_chip aplic_chip = {
> > > > > + .name = "RISC-V APLIC",
> > > > > + .irq_mask = aplic_irq_mask,
> > > > > + .irq_unmask = aplic_irq_unmask,
> > > > > + .irq_set_type = aplic_set_type,
> > > > > + .irq_eoi = aplic_irq_eoi,
> > > > > +#ifdef CONFIG_SMP
> > > > > + .irq_set_affinity = aplic_set_affinity,
> > > > > +#endif
> > > > > + .flags = IRQCHIP_SET_TYPE_MASKED |
> > > > > + IRQCHIP_SKIP_SET_WAKE |
> > > > > + IRQCHIP_MASK_ON_SUSPEND,
> > > > > +};
> > > > > +
> > > > > +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> > > > > + u32 gsi_base,
> > > > > + unsigned long *hwirq,
> > > > > + unsigned int *type)
> > > > > +{
> > > > > + if (WARN_ON(fwspec->param_count < 2))
> > > > > + return -EINVAL;
> > > > > + if (WARN_ON(!fwspec->param[0]))
> > > > > + return -EINVAL;
> > > > > +
> > > > > + /* For DT, gsi_base is always zero. */
> > > > > + *hwirq = fwspec->param[0] - gsi_base;
> > > > > + *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > > > > +
> > > > > + WARN_ON(*type == IRQ_TYPE_NONE);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> > > > > + struct irq_fwspec *fwspec,
> > > > > + unsigned long *hwirq,
> > > > > + unsigned int *type)
> > > > > +{
> > > > > + struct aplic_priv *priv = platform_msi_get_host_data(d);
> > > > > +
> > > > > + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > > +}
> > > > > +
> > > > > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > > > > + unsigned int virq, unsigned int nr_irqs,
> > > > > + void *arg)
> > > > > +{
> > > > > + int i, ret;
> > > > > + unsigned int type;
> > > > > + irq_hw_number_t hwirq;
> > > > > + struct irq_fwspec *fwspec = arg;
> > > > > + struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > > > > +
> > > > > + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > > + if (ret)
> > > > > + return ret;
> > > > > +
> > > > > + ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > > > > + if (ret)
> > > > > + return ret;
> > > > > +
> > > > > + for (i = 0; i < nr_irqs; i++) {
> > > > > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > > + &aplic_chip, priv, handle_fasteoi_irq,
> > > > > + NULL, NULL);
> > > > > + /*
> > > > > + * APLIC does not implement irq_disable() so Linux interrupt
> > > > > + * subsystem will take a lazy approach for disabling an APLIC
> > > > > + * interrupt. This means APLIC interrupts are left unmasked
> > > > > + * upon system suspend and interrupts are not processed
> > > > > + * immediately upon system wake up. To tackle this, we disable
> > > > > + * the lazy approach for all APLIC interrupts.
> > > > > + */
> > > > > + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > > + }
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > > > > + .translate = aplic_irqdomain_msi_translate,
> > > > > + .alloc = aplic_irqdomain_msi_alloc,
> > > > > + .free = platform_msi_device_domain_free,
> > > > > +};
> > > > > +
> > > > > +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> > > > > + struct irq_fwspec *fwspec,
> > > > > + unsigned long *hwirq,
> > > > > + unsigned int *type)
> > > > > +{
> > > > > + struct aplic_priv *priv = d->host_data;
> > > > > +
> > > > > + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > > +}
> > > > > +
> > > > > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > > > > + unsigned int virq, unsigned int nr_irqs,
> > > > > + void *arg)
> > > > > +{
> > > > > + int i, ret;
> > > > > + unsigned int type;
> > > > > + irq_hw_number_t hwirq;
> > > > > + struct irq_fwspec *fwspec = arg;
> > > > > + struct aplic_priv *priv = domain->host_data;
> > > > > +
> > > > > + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > > + if (ret)
> > > > > + return ret;
> > > > > +
> > > > > + for (i = 0; i < nr_irqs; i++) {
> > > > > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > > + &aplic_chip, priv, handle_fasteoi_irq,
> > > > > + NULL, NULL);
> > > > > + irq_set_affinity(virq + i, &priv->lmask);
> > > > > + /* See the reason described in aplic_irqdomain_msi_alloc() */
> > > > > + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > > + }
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > > > > + .translate = aplic_irqdomain_idc_translate,
> > > > > + .alloc = aplic_irqdomain_idc_alloc,
> > > > > + .free = irq_domain_free_irqs_top,
> > > > > +};
> > > > > +
> > > > > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > > > > +{
> > > > > + int i;
> > > > > +
> > > > > + /* Disable all interrupts */
> > > > > + for (i = 0; i <= priv->nr_irqs; i += 32)
> > > > > + writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > > > > + (i / 32) * sizeof(u32));
> > > > > +
> > > > > + /* Set interrupt type and default priority for all interrupts */
> > > > > + for (i = 1; i <= priv->nr_irqs; i++) {
> > > > > + writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > > > > + (i - 1) * sizeof(u32));
> > > > > + writel(APLIC_DEFAULT_PRIORITY,
> > > > > + priv->regs + APLIC_TARGET_BASE +
> > > > > + (i - 1) * sizeof(u32));
> > > > > + }
> > > > > +
> > > > > + /* Clear APLIC domaincfg */
> > > > > + writel(0, priv->regs + APLIC_DOMAINCFG);
> > > > > +}
> > > > > +
> > > > > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > > > > +{
> > > > > + u32 val;
> > > > > +#ifdef CONFIG_RISCV_M_MODE
> > > > > + u32 valH;
> > > > > +
> > > > > + if (!priv->nr_idcs) {
> > > > > + val = priv->msicfg.base_ppn;
> > > > > + valH = (priv->msicfg.base_ppn >> 32) &
> > > > > + APLIC_xMSICFGADDRH_BAPPN_MASK;
> > > > > + valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > > > > + << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > > > > + valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > > > > + << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > > > > + valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > > > > + << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > > > > + valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > > > > + << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > > > > + writel(val, priv->regs + APLIC_xMSICFGADDR);
> > > > > + writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > > > > + }
> > > > > +#endif
> > > > > +
> > > > > + /* Setup APLIC domaincfg register */
> > > > > + val = readl(priv->regs + APLIC_DOMAINCFG);
> > > > > + val |= APLIC_DOMAINCFG_IE;
> > > > > + if (!priv->nr_idcs)
> > > > > + val |= APLIC_DOMAINCFG_DM;
> > > > > + writel(val, priv->regs + APLIC_DOMAINCFG);
> > > > > + if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > > > > + pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> > > > > + priv->fwnode, val);
> > > > > +}
> > > > > +
> > > > > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > > > > +{
> > > > > + unsigned int group_index, hart_index, guest_index, val;
> > > > > + struct irq_data *d = irq_get_irq_data(desc->irq);
> > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > + struct aplic_msicfg *mc = &priv->msicfg;
> > > > > + phys_addr_t tppn, tbppn, msg_addr;
> > > > > + void __iomem *target;
> > > > > +
> > > > > + /* For zeroed MSI, simply write zero into the target register */
> > > > > + if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > > > > + target = priv->regs + APLIC_TARGET_BASE;
> > > > > + target += (d->hwirq - 1) * sizeof(u32);
> > > > > + writel(0, target);
> > > > > + return;
> > > > > + }
> > > > > +
> > > > > + /* Sanity check on message data */
> > > > > + WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > > > > +
> > > > > + /* Compute target MSI address */
> > > > > + msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > > > > + tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > +
> > > > > + /* Compute target HART Base PPN */
> > > > > + tbppn = tppn;
> > > > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > > + WARN_ON(tbppn != mc->base_ppn);
> > > > > +
> > > > > + /* Compute target group and hart indexes */
> > > > > + group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > > > > + APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > > > > + hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > > > > + APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > > > > + hart_index |= (group_index << mc->lhxw);
> > > > > + WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > > > > +
> > > > > + /* Compute target guest index */
> > > > > + guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > + WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > > > > +
> > > > > + /* Update IRQ TARGET register */
> > > > > + target = priv->regs + APLIC_TARGET_BASE;
> > > > > + target += (d->hwirq - 1) * sizeof(u32);
> > > > > + val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > > > > + << APLIC_TARGET_HART_IDX_SHIFT;
> > > > > + val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > > > > + << APLIC_TARGET_GUEST_IDX_SHIFT;
> > > > > + val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > > > > + writel(val, target);
> > > > > +}
> > > > > +
> > > > > +static int aplic_setup_msi(struct aplic_priv *priv)
> > > > > +{
> > > > > + struct aplic_msicfg *mc = &priv->msicfg;
> > > > > + const struct imsic_global_config *imsic_global;
> > > > > +
> > > > > + /*
> > > > > + * The APLIC outgoing MSI config registers assume target MSI
> > > > > + * controller to be RISC-V AIA IMSIC controller.
> > > > > + */
> > > > > + imsic_global = imsic_get_global_config();
> > > > > + if (!imsic_global) {
> > > > > + pr_err("%pfwP: IMSIC global config not found\n",
> > > > > + priv->fwnode);
> > > > > + return -ENODEV;
> > > > > + }
> > > > > +
> > > > > + /* Find number of guest index bits (LHXS) */
> > > > > + mc->lhxs = imsic_global->guest_index_bits;
> > > > > + if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > > > > > + pr_err("%pfwP: IMSIC guest index bits too big for APLIC LHXS\n",
> > > > > + priv->fwnode);
> > > > > + return -EINVAL;
> > > > > + }
> > > > > +
> > > > > + /* Find number of HART index bits (LHXW) */
> > > > > + mc->lhxw = imsic_global->hart_index_bits;
> > > > > + if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > > > > > + pr_err("%pfwP: IMSIC hart index bits too big for APLIC LHXW\n",
> > > > > + priv->fwnode);
> > > > > + return -EINVAL;
> > > > > + }
> > > > > +
> > > > > + /* Find number of group index bits (HHXW) */
> > > > > + mc->hhxw = imsic_global->group_index_bits;
> > > > > + if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > > > > > + pr_err("%pfwP: IMSIC group index bits too big for APLIC HHXW\n",
> > > > > + priv->fwnode);
> > > > > + return -EINVAL;
> > > > > + }
> > > > > +
> > > > > + /* Find first bit position of group index (HHXS) */
> > > > > + mc->hhxs = imsic_global->group_index_shift;
> > > > > + if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > > > > + pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> > > > > + priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > > > > + return -EINVAL;
> > > > > + }
> > > > > + mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > > > > + if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > > > > > + pr_err("%pfwP: IMSIC group index shift too big for APLIC HHXS\n",
> > > > > + priv->fwnode);
> > > > > + return -EINVAL;
> > > > > + }
> > > > > +
> > > > > + /* Compute PPN base */
> > > > > + mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > > +
> > > > > + /* Use all possible CPUs as lmask */
> > > > > + cpumask_copy(&priv->lmask, cpu_possible_mask);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > > + * To handle APLIC IDC interrupts, we just read the CLAIMI register
> > > > > > + * which will return the highest priority pending interrupt and clear the
> > > > > > + * pending bit of the interrupt. This process is repeated until the CLAIMI
> > > > > > + * register returns zero.
> > > > > + */
> > > > > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > > > > +{
> > > > > + struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > > > > + struct irq_chip *chip = irq_desc_get_chip(desc);
> > > > > + irq_hw_number_t hw_irq;
> > > > > + int irq;
> > > > > +
> > > > > + chained_irq_enter(chip, desc);
> > > > > +
> > > > > + while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > > > > + hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > > > > + irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > > > > +
> > > > > + if (unlikely(irq <= 0))
> > > > > + pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > > > > + hw_irq);
> > > > > + else
> > > > > + generic_handle_irq(irq);
> > > > > + }
> > > > > +
> > > > > + chained_irq_exit(chip, desc);
> > > > > +}
> > > > > +
> > > > > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > > > > +{
> > > > > + u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > > > > + u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > > > > +
> > > > > + /* Priority must be less than threshold for interrupt triggering */
> > > > > + writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > > > > +
> > > > > + /* Delivery must be set to 1 for interrupt triggering */
> > > > > + writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > > > > +}
> > > > > +
> > > > > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > > > > +{
> > > > > + if (aplic_idc_parent_irq)
> > > > > + disable_percpu_irq(aplic_idc_parent_irq);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > > > > +{
> > > > > + if (aplic_idc_parent_irq)
> > > > > + enable_percpu_irq(aplic_idc_parent_irq,
> > > > > + irq_get_trigger_type(aplic_idc_parent_irq));
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +static int aplic_setup_idc(struct aplic_priv *priv)
> > > > > +{
> > > > > + int i, j, rc, cpu, setup_count = 0;
> > > > > + struct fwnode_reference_args parent;
> > > > > + struct irq_domain *domain;
> > > > > + unsigned long hartid;
> > > > > + struct aplic_idc *idc;
> > > > > + u32 val;
> > > > > +
> > > > > + /* Setup per-CPU IDC and target CPU mask */
> > > > > + for (i = 0; i < priv->nr_idcs; i++) {
> > > > > + rc = fwnode_property_get_reference_args(priv->fwnode,
> > > > > + "interrupts-extended", "#interrupt-cells",
> > > > > + 0, i, &parent);
> > > > > + if (rc) {
> > > > > + pr_warn("%pfwP: parent irq for IDC%d not found\n",
> > > > > + priv->fwnode, i);
> > > > > + continue;
> > > > > + }
> > > > > +
> > > > > + /*
> > > > > + * Skip interrupts other than external interrupts for
> > > > > + * current privilege level.
> > > > > + */
> > > > > + if (parent.args[0] != RV_IRQ_EXT)
> > > > > + continue;
> > > > > +
> > > > > + rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> > > > > + if (rc) {
> > > > > + pr_warn("%pfwP: invalid hartid for IDC%d\n",
> > > > > + priv->fwnode, i);
> > > > > + continue;
> > > > > + }
> > > > > +
> > > > > + cpu = riscv_hartid_to_cpuid(hartid);
> > > > > + if (cpu < 0) {
> > > > > + pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> > > > > + priv->fwnode, i);
> > > > > + continue;
> > > > > + }
> > > > > +
> > > > > + cpumask_set_cpu(cpu, &priv->lmask);
> > > > > +
> > > > > + idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > > + idc->hart_index = i;
> > > > > + idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > > > > + idc->priv = priv;
> > > > > +
> > > > > + aplic_idc_set_delivery(idc, true);
> > > > > +
> > > > > + /*
> > > > > + * Boot cpu might not have APLIC hart_index = 0 so check
> > > > > + * and update target registers of all interrupts.
> > > > > + */
> > > > > + if (cpu == smp_processor_id() && idc->hart_index) {
> > > > > + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > > + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > > + val |= APLIC_DEFAULT_PRIORITY;
> > > > > + for (j = 1; j <= priv->nr_irqs; j++)
> > > > > + writel(val, priv->regs + APLIC_TARGET_BASE +
> > > > > + (j - 1) * sizeof(u32));
> > > > > + }
> > > > > +
> > > > > + setup_count++;
> > > > > + }
> > > > > +
> > > > > + /* Find parent domain and register chained handler */
> > > > > + domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > > > > + DOMAIN_BUS_ANY);
> > > > > + if (!aplic_idc_parent_irq && domain) {
> > > > > + aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > > > > + if (aplic_idc_parent_irq) {
> > > > > + irq_set_chained_handler(aplic_idc_parent_irq,
> > > > > + aplic_idc_handle_irq);
> > > > > +
> > > > > + /*
> > > > > + * Setup CPUHP notifier to enable IDC parent
> > > > > + * interrupt on all CPUs
> > > > > + */
> > > > > + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > > > > + "irqchip/riscv/aplic:starting",
> > > > > + aplic_idc_starting_cpu,
> > > > > + aplic_idc_dying_cpu);
> > > > > + }
> > > > > + }
> > > > > +
> > > > > + /* Fail if we were not able to setup IDC for any CPU */
> > > > > + return (setup_count) ? 0 : -ENODEV;
> > > > > +}
> > > > > +
> > > > > +static int aplic_probe(struct platform_device *pdev)
> > > > > +{
> > > > > + struct fwnode_handle *fwnode = pdev->dev.fwnode;
> > > > > + struct fwnode_reference_args parent;
> > > > > + struct aplic_priv *priv;
> > > > > + struct resource *res;
> > > > > + phys_addr_t pa;
> > > > > + int rc;
> > > > > +
> > > > > + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > > > > + if (!priv)
> > > > > + return -ENOMEM;
> > > > > + priv->fwnode = fwnode;
> > > > > +
> > > > > + /* Map the MMIO registers */
> > > > > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > > > + if (!res) {
> > > > > + pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> > > > > + return -EINVAL;
> > > > > + }
> > > > > + priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> > > > > + if (!priv->regs) {
> > > > > > + pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
> > > > > + return -ENOMEM;
> > > > > + }
> > > > > +
> > > > > + /*
> > > > > + * Find out GSI base number
> > > > > + *
> > > > > + * Note: DT does not define "riscv,gsi-base" property so GSI
> > > > > + * base is always zero for DT.
> > > > > + */
> > > > > + rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> > > > > + &priv->gsi_base, 1);
> > > > > + if (rc)
> > > > > + priv->gsi_base = 0;
> > > > > +
> > > > > + /* Find out number of interrupt sources */
> > > > > + rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> > > > > + &priv->nr_irqs, 1);
> > > > > + if (rc) {
> > > > > + pr_err("%pfwP: failed to get number of interrupt sources\n",
> > > > > + fwnode);
> > > > > + return rc;
> > > > > + }
> > > > > +
> > > > > > + /* Setup initial state of APLIC interrupts */
> > > > > + aplic_init_hw_irqs(priv);
> > > > > +
> > > > > + /*
> > > > > + * Find out number of IDCs based on parent interrupts
> > > > > + *
> > > > > + * If "msi-parent" property is present then we ignore the
> > > > > + * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > > > > + */
> > > > > + if (!fwnode_property_present(fwnode, "msi-parent")) {
> > > > > + while (!fwnode_property_get_reference_args(fwnode,
> > > > > + "interrupts-extended", "#interrupt-cells",
> > > > > + 0, priv->nr_idcs, &parent))
> > > > > + priv->nr_idcs++;
> > > > > + }
> > > > > +
> > > > > + /* Setup IDCs or MSIs based on number of IDCs */
> > > > > + if (priv->nr_idcs)
> > > > > + rc = aplic_setup_idc(priv);
> > > > > + else
> > > > > + rc = aplic_setup_msi(priv);
> > > > > + if (rc) {
> > > > > > + pr_err("%pfwP: failed to setup %s\n",
> > > > > + fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> > > > > + return rc;
> > > > > + }
> > > > > +
> > > > > + /* Setup global config and interrupt delivery */
> > > > > + aplic_init_hw_global(priv);
> > > > > +
> > > > > + /* Create irq domain instance for the APLIC */
> > > > > + if (priv->nr_idcs)
> > > > > + priv->irqdomain = irq_domain_create_linear(
> > > > > + priv->fwnode,
> > > > > + priv->nr_irqs + 1,
> > > > > + &aplic_irqdomain_idc_ops,
> > > > > + priv);
> > > > > + else
> > > > > + priv->irqdomain = platform_msi_create_device_domain(
> > > > > + &pdev->dev,
> > > > > + priv->nr_irqs + 1,
> > > > > + aplic_msi_write_msg,
> > > > > + &aplic_irqdomain_msi_ops,
> > > > > + priv);
> > > > > + if (!priv->irqdomain) {
> > > > > + pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> > > > > + return -ENOMEM;
> > > > > + }
> > > > > +
> > > > > + /* Advertise the interrupt controller */
> > > > > + if (priv->nr_idcs) {
> > > > > + pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> > > > > + priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> > > > > + } else {
> > > > > + pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > > + pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
> > > > > + priv->fwnode, priv->nr_irqs, &pa);
> > > > > + }
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +static const struct of_device_id aplic_match[] = {
> > > > > + { .compatible = "riscv,aplic" },
> > > > > + {}
> > > > > +};
> > > > > +
> > > > > +static struct platform_driver aplic_driver = {
> > > > > + .driver = {
> > > > > + .name = "riscv-aplic",
> > > > > + .of_match_table = aplic_match,
> > > > > + },
> > > > > + .probe = aplic_probe,
> > > > > +};
> > > > > +builtin_platform_driver(aplic_driver);
> > > > > +
> > > > > +static int __init aplic_dt_init(struct device_node *node,
> > > > > + struct device_node *parent)
> > > > > +{
> > > > > + /*
> > > > > + * The APLIC platform driver needs to be probed early
> > > > > + * so for device tree:
> > > > > + *
> > > > > + * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > > > + * provides a hint to the device driver core to probe the
> > > > > + * platform driver early.
> > > > > + * 2) Clear the OF_POPULATED flag in device_node because
> > > > > + * of_irq_init() sets it which prevents creation of
> > > > > + * platform device.
> > > > > + */
> > > > > + node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> > > >
> > > > NACK. You are blindly plastering flags without trying to understand
> > > > the real issue and fixing this correctly.
> > > >
> > > > > + of_node_clear_flag(node, OF_POPULATED);
> > > > > + return 0;
> > > > > +}
> > > > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> > > >
> > > > This macro pretty much skips the entire driver core framework to probe
> > > > and calls init and you are supposed to initialize the device when the
> > > > init function is called.
> > > >
> > > > If you want your device/driver to follow the proper platform driver
> > > > path (which is recommended), then you need to use the
> > > > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> > > >
> > > > I offered to help you debug this issue and I asked for a dts file that
> > > > corresponds to a board you are testing this on and seeing an issue.
> > > > But you haven't answered my question [1] and are pointing to some
> > > > random commit and blaming it. That commit has no impact on any
> > > > existing devices/drivers.
> > > >
> > > > Hi Marc,
> > > >
> > > > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > > > is used or until Anup actually works with us to debug the real issue.
> > >
> > > Maybe I misread your previous comment.
> > >
> > > You can easily reproduce the issue on QEMU virt machine for RISC-V:
> > > 1) Build qemu-system-riscv64 from latest QEMU master
> > > 2) Build kernel from riscv_aia_v4 branch at https://github.com/avpatel/linux.git
> > > (Note: make sure you remove the FWNODE_FLAG_BEST_EFFORT flag from
> > > APLIC driver at the time of building kernel)
> > > 3) Boot an APLIC-only system on QEMU virt machine
> > > qemu-system-riscv64 -smp 4 -M virt,aia=aplic -m 1G -nographic \
> > > -bios opensbi/build/platform/generic/firmware/fw_dynamic.bin \
> > > -kernel ./build-riscv64/arch/riscv/boot/Image \
> > > -append "root=/dev/ram rw console=ttyS0 earlycon" \
> > > -initrd ./rootfs_riscv64.img
> >
> > Unfortunately, I don't have the time to do all that, but I generally
> > don't need to run something to figure out the issue. It's generally
> > fairly obvious once I look at the DT. I'll also lean on you for some
> > debug logs.
>
> The boot log with the FWNODE_FLAG_BEST_EFFORT flag in APLIC can be
> found at:
> https://drive.google.com/file/d/1C-uuHbh6Zk9xkAsfGLfhb_4WighvmQp1/view?usp=sharing
>
> The boot log without the FWNODE_FLAG_BEST_EFFORT flag in APLIC can
> be found at:
> https://drive.google.com/file/d/12SRdR-2Frv_5O06kbuI_LUJ88khjf_7O/view?usp=sharing
>
> >
> > Where is the dts file that corresponds to this QEMU run? This is the
> > third time I'm asking for a pointer to a dts file that has this issue,
> > can you point me to it please? I shouldn't have to say this but: put
> > it somewhere and point me to it please. Please don't point me to some
> > git repo and ask me to dig around.
>
> For QEMU virt machine, the DTB is generated at runtime as part of
> virt machine creation. The DTS dumped by QEMU using the "dumpdtb"
> command line option can be found at:
> https://drive.google.com/file/d/1EU-exItL1B7EWuoXw4q-Ypocq--5Wvn8/view?usp=sharing
>
> >
> > Can you give me details on what supplier is causing the deferred probe
> > that's a problem for you? Any other details you can provide that'll
> > help debug this issue?
>
> FWNODE supplier for APLIC DT node is the OF framework.
>
> >
> > > I hope the above steps help you reproduce the issue. I will certainly
> > > test whatever fix you propose.
> >
> > Do you plan to try the fix I suggested already? The one about using
> > the correct macros?
>
> You mean use IRQCHIP_DECLARE() in the APLIC driver?
> Or something else?

No. In my previous email I asked you NOT to use IRQCHIP_DECLARE() and
to use the IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros instead.
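
For reference, a rough sketch of what that macro usage generally looks like.
This is a kernel-only fragment (it does not build outside a kernel tree) and
not a tested replacement for the patch. The callback name aplic_of_init() is
hypothetical, and the body is left as a placeholder for the driver's real
setup.

```c
#include <linux/irqchip.h>
#include <linux/of.h>
#include <linux/platform_device.h>

/*
 * Hypothetical init callback. It has the same signature as an
 * IRQCHIP_DECLARE() callback, but it is invoked from the platform
 * driver probe path instead of from of_irq_init().
 */
static int aplic_of_init(struct device_node *node,
			 struct device_node *parent)
{
	/* The APLIC setup for @node would go here. */
	return 0;
}

/* Expands to an OF match table plus a builtin platform driver. */
IRQCHIP_PLATFORM_DRIVER_BEGIN(aplic)
IRQCHIP_MATCH("riscv,aplic", aplic_of_init)
IRQCHIP_PLATFORM_DRIVER_END(aplic)
```

With this form the driver core probes the device like any other platform
device, so neither FWNODE_FLAG_BEST_EFFORT nor clearing OF_POPULATED
should be needed.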

-Saravana

2023-06-23 12:10:31

by Anup Patel

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

On Fri, Jun 23, 2023 at 2:27 AM Saravana Kannan <[email protected]> wrote:
>
> On Sun, Jun 18, 2023 at 11:13 PM Anup Patel <[email protected]> wrote:
> >
> > On Sat, Jun 17, 2023 at 3:36 AM Saravana Kannan <[email protected]> wrote:
> > >
> > > On Thu, Jun 15, 2023 at 7:01 PM Anup Patel <[email protected]> wrote:
> > > >
> > > > On Fri, Jun 16, 2023 at 12:47 AM Saravana Kannan <[email protected]> wrote:
> > > > >
> > > > > On Tue, Jun 13, 2023 at 8:35 AM Anup Patel <[email protected]> wrote:
> > > > > >
> > > > > > The RISC-V advanced interrupt architecture (AIA) specification defines
> > > > > > a new interrupt controller for managing wired interrupts on a RISC-V
> > > > > > platform. This new interrupt controller is referred to as advanced
> > > > > > platform-level interrupt controller (APLIC) which can forward wired
> > > > > > interrupts to CPUs (or HARTs) as local interrupts OR as message
> > > > > > signaled interrupts.
> > > > > > > (For more details refer to https://github.com/riscv/riscv-aia)
> > > > > >
> > > > > > This patch adds an irqchip driver for RISC-V APLIC found on RISC-V
> > > > > > platforms.
> > > > > >
> > > > > > Signed-off-by: Anup Patel <[email protected]>
> > > > > > ---
> > > > > > drivers/irqchip/Kconfig | 6 +
> > > > > > drivers/irqchip/Makefile | 1 +
> > > > > > drivers/irqchip/irq-riscv-aplic.c | 765 ++++++++++++++++++++++++++++
> > > > > > include/linux/irqchip/riscv-aplic.h | 119 +++++
> > > > > > 4 files changed, 891 insertions(+)
> > > > > > create mode 100644 drivers/irqchip/irq-riscv-aplic.c
> > > > > > create mode 100644 include/linux/irqchip/riscv-aplic.h
> > > > > >
> > > > > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > > > > > index d700980372ef..834c0329f583 100644
> > > > > > --- a/drivers/irqchip/Kconfig
> > > > > > +++ b/drivers/irqchip/Kconfig
> > > > > > @@ -544,6 +544,12 @@ config SIFIVE_PLIC
> > > > > > select IRQ_DOMAIN_HIERARCHY
> > > > > > select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
> > > > > >
> > > > > > +config RISCV_APLIC
> > > > > > + bool
> > > > > > + depends on RISCV
> > > > > > + select IRQ_DOMAIN_HIERARCHY
> > > > > > + select GENERIC_MSI_IRQ
> > > > > > +
> > > > > > config RISCV_IMSIC
> > > > > > bool
> > > > > > depends on RISCV
> > > > > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> > > > > > index 577bde3e986b..438b8e1a152c 100644
> > > > > > --- a/drivers/irqchip/Makefile
> > > > > > +++ b/drivers/irqchip/Makefile
> > > > > > @@ -95,6 +95,7 @@ obj-$(CONFIG_QCOM_MPM) += irq-qcom-mpm.o
> > > > > > obj-$(CONFIG_CSKY_MPINTC) += irq-csky-mpintc.o
> > > > > > obj-$(CONFIG_CSKY_APB_INTC) += irq-csky-apb-intc.o
> > > > > > obj-$(CONFIG_RISCV_INTC) += irq-riscv-intc.o
> > > > > > +obj-$(CONFIG_RISCV_APLIC) += irq-riscv-aplic.o
> > > > > > obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic.o
> > > > > > obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
> > > > > > obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
> > > > > > diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
> > > > > > new file mode 100644
> > > > > > index 000000000000..1e710fdf5608
> > > > > > --- /dev/null
> > > > > > +++ b/drivers/irqchip/irq-riscv-aplic.c
> > > > > > @@ -0,0 +1,765 @@
> > > > > > +// SPDX-License-Identifier: GPL-2.0
> > > > > > +/*
> > > > > > + * Copyright (C) 2021 Western Digital Corporation or its affiliates.
> > > > > > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > > > > > + */
> > > > > > +
> > > > > > +#define pr_fmt(fmt) "riscv-aplic: " fmt
> > > > > > +#include <linux/bitops.h>
> > > > > > +#include <linux/cpu.h>
> > > > > > +#include <linux/interrupt.h>
> > > > > > +#include <linux/io.h>
> > > > > > +#include <linux/irq.h>
> > > > > > +#include <linux/irqchip.h>
> > > > > > +#include <linux/irqchip/chained_irq.h>
> > > > > > +#include <linux/irqchip/riscv-aplic.h>
> > > > > > +#include <linux/irqchip/riscv-imsic.h>
> > > > > > +#include <linux/irqdomain.h>
> > > > > > +#include <linux/module.h>
> > > > > > +#include <linux/msi.h>
> > > > > > +#include <linux/platform_device.h>
> > > > > > +#include <linux/smp.h>
> > > > > > +
> > > > > > +#define APLIC_DEFAULT_PRIORITY 1
> > > > > > +#define APLIC_DISABLE_IDELIVERY 0
> > > > > > +#define APLIC_ENABLE_IDELIVERY 1
> > > > > > +#define APLIC_DISABLE_ITHRESHOLD 1
> > > > > > +#define APLIC_ENABLE_ITHRESHOLD 0
> > > > > > +
> > > > > > +struct aplic_msicfg {
> > > > > > + phys_addr_t base_ppn;
> > > > > > + u32 hhxs;
> > > > > > + u32 hhxw;
> > > > > > + u32 lhxs;
> > > > > > + u32 lhxw;
> > > > > > +};
> > > > > > +
> > > > > > +struct aplic_idc {
> > > > > > + unsigned int hart_index;
> > > > > > + void __iomem *regs;
> > > > > > + struct aplic_priv *priv;
> > > > > > +};
> > > > > > +
> > > > > > +struct aplic_priv {
> > > > > > + struct fwnode_handle *fwnode;
> > > > > > + u32 gsi_base;
> > > > > > + u32 nr_irqs;
> > > > > > + u32 nr_idcs;
> > > > > > + void __iomem *regs;
> > > > > > + struct irq_domain *irqdomain;
> > > > > > + struct aplic_msicfg msicfg;
> > > > > > + struct cpumask lmask;
> > > > > > +};
> > > > > > +
> > > > > > +static unsigned int aplic_idc_parent_irq;
> > > > > > +static DEFINE_PER_CPU(struct aplic_idc, aplic_idcs);
> > > > > > +
> > > > > > +static void aplic_irq_unmask(struct irq_data *d)
> > > > > > +{
> > > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > +
> > > > > > + writel(d->hwirq, priv->regs + APLIC_SETIENUM);
> > > > > > +
> > > > > > + if (!priv->nr_idcs)
> > > > > > + irq_chip_unmask_parent(d);
> > > > > > +}
> > > > > > +
> > > > > > +static void aplic_irq_mask(struct irq_data *d)
> > > > > > +{
> > > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > +
> > > > > > + writel(d->hwirq, priv->regs + APLIC_CLRIENUM);
> > > > > > +
> > > > > > + if (!priv->nr_idcs)
> > > > > > + irq_chip_mask_parent(d);
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_set_type(struct irq_data *d, unsigned int type)
> > > > > > +{
> > > > > > + u32 val = 0;
> > > > > > + void __iomem *sourcecfg;
> > > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > +
> > > > > > + switch (type) {
> > > > > > + case IRQ_TYPE_NONE:
> > > > > > + val = APLIC_SOURCECFG_SM_INACTIVE;
> > > > > > + break;
> > > > > > + case IRQ_TYPE_LEVEL_LOW:
> > > > > > + val = APLIC_SOURCECFG_SM_LEVEL_LOW;
> > > > > > + break;
> > > > > > + case IRQ_TYPE_LEVEL_HIGH:
> > > > > > + val = APLIC_SOURCECFG_SM_LEVEL_HIGH;
> > > > > > + break;
> > > > > > + case IRQ_TYPE_EDGE_FALLING:
> > > > > > + val = APLIC_SOURCECFG_SM_EDGE_FALL;
> > > > > > + break;
> > > > > > + case IRQ_TYPE_EDGE_RISING:
> > > > > > + val = APLIC_SOURCECFG_SM_EDGE_RISE;
> > > > > > + break;
> > > > > > + default:
> > > > > > + return -EINVAL;
> > > > > > + }
> > > > > > +
> > > > > > + sourcecfg = priv->regs + APLIC_SOURCECFG_BASE;
> > > > > > + sourcecfg += (d->hwirq - 1) * sizeof(u32);
> > > > > > + writel(val, sourcecfg);
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static void aplic_irq_eoi(struct irq_data *d)
> > > > > > +{
> > > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > + u32 reg_off, reg_mask;
> > > > > > +
> > > > > > + /*
> > > > > > > + * EOI handling is required only for level-triggered
> > > > > > + * interrupts in APLIC MSI mode.
> > > > > > + */
> > > > > > +
> > > > > > + if (priv->nr_idcs)
> > > > > > + return;
> > > > > > +
> > > > > > + reg_off = APLIC_CLRIP_BASE + ((d->hwirq / APLIC_IRQBITS_PER_REG) * 4);
> > > > > > + reg_mask = BIT(d->hwirq % APLIC_IRQBITS_PER_REG);
> > > > > > + switch (irqd_get_trigger_type(d)) {
> > > > > > + case IRQ_TYPE_LEVEL_LOW:
> > > > > > + if (!(readl(priv->regs + reg_off) & reg_mask))
> > > > > > + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > > > + break;
> > > > > > + case IRQ_TYPE_LEVEL_HIGH:
> > > > > > + if (readl(priv->regs + reg_off) & reg_mask)
> > > > > > + writel(d->hwirq, priv->regs + APLIC_SETIPNUM_LE);
> > > > > > + break;
> > > > > > + }
> > > > > > +}
> > > > > > +
> > > > > > +#ifdef CONFIG_SMP
> > > > > > +static int aplic_set_affinity(struct irq_data *d,
> > > > > > + const struct cpumask *mask_val, bool force)
> > > > > > +{
> > > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > + struct aplic_idc *idc;
> > > > > > + unsigned int cpu, val;
> > > > > > + struct cpumask amask;
> > > > > > + void __iomem *target;
> > > > > > +
> > > > > > + if (!priv->nr_idcs)
> > > > > > + return irq_chip_set_affinity_parent(d, mask_val, force);
> > > > > > +
> > > > > > + cpumask_and(&amask, &priv->lmask, mask_val);
> > > > > > +
> > > > > > + if (force)
> > > > > > + cpu = cpumask_first(&amask);
> > > > > > + else
> > > > > > + cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > > > +
> > > > > > + if (cpu >= nr_cpu_ids)
> > > > > > + return -EINVAL;
> > > > > > +
> > > > > > + idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > > > + target = priv->regs + APLIC_TARGET_BASE;
> > > > > > + target += (d->hwirq - 1) * sizeof(u32);
> > > > > > + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > > > + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > > > + val |= APLIC_DEFAULT_PRIORITY;
> > > > > > + writel(val, target);
> > > > > > +
> > > > > > + irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > > > +
> > > > > > + return IRQ_SET_MASK_OK_DONE;
> > > > > > +}
> > > > > > +#endif
> > > > > > +
> > > > > > +static struct irq_chip aplic_chip = {
> > > > > > + .name = "RISC-V APLIC",
> > > > > > + .irq_mask = aplic_irq_mask,
> > > > > > + .irq_unmask = aplic_irq_unmask,
> > > > > > + .irq_set_type = aplic_set_type,
> > > > > > + .irq_eoi = aplic_irq_eoi,
> > > > > > +#ifdef CONFIG_SMP
> > > > > > + .irq_set_affinity = aplic_set_affinity,
> > > > > > +#endif
> > > > > > + .flags = IRQCHIP_SET_TYPE_MASKED |
> > > > > > + IRQCHIP_SKIP_SET_WAKE |
> > > > > > + IRQCHIP_MASK_ON_SUSPEND,
> > > > > > +};
> > > > > > +
> > > > > > +static int aplic_irqdomain_translate(struct irq_fwspec *fwspec,
> > > > > > + u32 gsi_base,
> > > > > > + unsigned long *hwirq,
> > > > > > + unsigned int *type)
> > > > > > +{
> > > > > > + if (WARN_ON(fwspec->param_count < 2))
> > > > > > + return -EINVAL;
> > > > > > + if (WARN_ON(!fwspec->param[0]))
> > > > > > + return -EINVAL;
> > > > > > +
> > > > > > + /* For DT, gsi_base is always zero. */
> > > > > > + *hwirq = fwspec->param[0] - gsi_base;
> > > > > > + *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
> > > > > > +
> > > > > > + WARN_ON(*type == IRQ_TYPE_NONE);
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_irqdomain_msi_translate(struct irq_domain *d,
> > > > > > + struct irq_fwspec *fwspec,
> > > > > > + unsigned long *hwirq,
> > > > > > + unsigned int *type)
> > > > > > +{
> > > > > > + struct aplic_priv *priv = platform_msi_get_host_data(d);
> > > > > > +
> > > > > > + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_irqdomain_msi_alloc(struct irq_domain *domain,
> > > > > > + unsigned int virq, unsigned int nr_irqs,
> > > > > > + void *arg)
> > > > > > +{
> > > > > > + int i, ret;
> > > > > > + unsigned int type;
> > > > > > + irq_hw_number_t hwirq;
> > > > > > + struct irq_fwspec *fwspec = arg;
> > > > > > + struct aplic_priv *priv = platform_msi_get_host_data(domain);
> > > > > > +
> > > > > > + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > > > + if (ret)
> > > > > > + return ret;
> > > > > > +
> > > > > > + ret = platform_msi_device_domain_alloc(domain, virq, nr_irqs);
> > > > > > + if (ret)
> > > > > > + return ret;
> > > > > > +
> > > > > > + for (i = 0; i < nr_irqs; i++) {
> > > > > > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > > > + &aplic_chip, priv, handle_fasteoi_irq,
> > > > > > + NULL, NULL);
> > > > > > + /*
> > > > > > + * APLIC does not implement irq_disable() so Linux interrupt
> > > > > > + * subsystem will take a lazy approach for disabling an APLIC
> > > > > > + * interrupt. This means APLIC interrupts are left unmasked
> > > > > > + * upon system suspend and interrupts are not processed
> > > > > > + * immediately upon system wake up. To tackle this, we disable
> > > > > > + * the lazy approach for all APLIC interrupts.
> > > > > > + */
> > > > > > + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > > > + }
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static const struct irq_domain_ops aplic_irqdomain_msi_ops = {
> > > > > > + .translate = aplic_irqdomain_msi_translate,
> > > > > > + .alloc = aplic_irqdomain_msi_alloc,
> > > > > > + .free = platform_msi_device_domain_free,
> > > > > > +};
> > > > > > +
> > > > > > +static int aplic_irqdomain_idc_translate(struct irq_domain *d,
> > > > > > + struct irq_fwspec *fwspec,
> > > > > > + unsigned long *hwirq,
> > > > > > + unsigned int *type)
> > > > > > +{
> > > > > > + struct aplic_priv *priv = d->host_data;
> > > > > > +
> > > > > > + return aplic_irqdomain_translate(fwspec, priv->gsi_base, hwirq, type);
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_irqdomain_idc_alloc(struct irq_domain *domain,
> > > > > > + unsigned int virq, unsigned int nr_irqs,
> > > > > > + void *arg)
> > > > > > +{
> > > > > > + int i, ret;
> > > > > > + unsigned int type;
> > > > > > + irq_hw_number_t hwirq;
> > > > > > + struct irq_fwspec *fwspec = arg;
> > > > > > + struct aplic_priv *priv = domain->host_data;
> > > > > > +
> > > > > > + ret = aplic_irqdomain_translate(fwspec, priv->gsi_base, &hwirq, &type);
> > > > > > + if (ret)
> > > > > > + return ret;
> > > > > > +
> > > > > > + for (i = 0; i < nr_irqs; i++) {
> > > > > > + irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > > > + &aplic_chip, priv, handle_fasteoi_irq,
> > > > > > + NULL, NULL);
> > > > > > + irq_set_affinity(virq + i, &priv->lmask);
> > > > > > + /* See the reason described in aplic_irqdomain_msi_alloc() */
> > > > > > + irq_set_status_flags(virq + i, IRQ_DISABLE_UNLAZY);
> > > > > > + }
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static const struct irq_domain_ops aplic_irqdomain_idc_ops = {
> > > > > > + .translate = aplic_irqdomain_idc_translate,
> > > > > > + .alloc = aplic_irqdomain_idc_alloc,
> > > > > > + .free = irq_domain_free_irqs_top,
> > > > > > +};
> > > > > > +
> > > > > > +static void aplic_init_hw_irqs(struct aplic_priv *priv)
> > > > > > +{
> > > > > > + int i;
> > > > > > +
> > > > > > + /* Disable all interrupts */
> > > > > > + for (i = 0; i <= priv->nr_irqs; i += 32)
> > > > > > + writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > > > > > + (i / 32) * sizeof(u32));
> > > > > > +
> > > > > > + /* Set interrupt type and default priority for all interrupts */
> > > > > > + for (i = 1; i <= priv->nr_irqs; i++) {
> > > > > > + writel(0, priv->regs + APLIC_SOURCECFG_BASE +
> > > > > > + (i - 1) * sizeof(u32));
> > > > > > + writel(APLIC_DEFAULT_PRIORITY,
> > > > > > + priv->regs + APLIC_TARGET_BASE +
> > > > > > + (i - 1) * sizeof(u32));
> > > > > > + }
> > > > > > +
> > > > > > + /* Clear APLIC domaincfg */
> > > > > > + writel(0, priv->regs + APLIC_DOMAINCFG);
> > > > > > +}
> > > > > > +
> > > > > > +static void aplic_init_hw_global(struct aplic_priv *priv)
> > > > > > +{
> > > > > > + u32 val;
> > > > > > +#ifdef CONFIG_RISCV_M_MODE
> > > > > > + u32 valH;
> > > > > > +
> > > > > > + if (!priv->nr_idcs) {
> > > > > > + val = priv->msicfg.base_ppn;
> > > > > > + valH = (priv->msicfg.base_ppn >> 32) &
> > > > > > + APLIC_xMSICFGADDRH_BAPPN_MASK;
> > > > > > + valH |= (priv->msicfg.lhxw & APLIC_xMSICFGADDRH_LHXW_MASK)
> > > > > > + << APLIC_xMSICFGADDRH_LHXW_SHIFT;
> > > > > > + valH |= (priv->msicfg.hhxw & APLIC_xMSICFGADDRH_HHXW_MASK)
> > > > > > + << APLIC_xMSICFGADDRH_HHXW_SHIFT;
> > > > > > + valH |= (priv->msicfg.lhxs & APLIC_xMSICFGADDRH_LHXS_MASK)
> > > > > > + << APLIC_xMSICFGADDRH_LHXS_SHIFT;
> > > > > > + valH |= (priv->msicfg.hhxs & APLIC_xMSICFGADDRH_HHXS_MASK)
> > > > > > + << APLIC_xMSICFGADDRH_HHXS_SHIFT;
> > > > > > + writel(val, priv->regs + APLIC_xMSICFGADDR);
> > > > > > + writel(valH, priv->regs + APLIC_xMSICFGADDRH);
> > > > > > + }
> > > > > > +#endif
> > > > > > +
> > > > > > + /* Setup APLIC domaincfg register */
> > > > > > + val = readl(priv->regs + APLIC_DOMAINCFG);
> > > > > > + val |= APLIC_DOMAINCFG_IE;
> > > > > > + if (!priv->nr_idcs)
> > > > > > + val |= APLIC_DOMAINCFG_DM;
> > > > > > + writel(val, priv->regs + APLIC_DOMAINCFG);
> > > > > > + if (readl(priv->regs + APLIC_DOMAINCFG) != val)
> > > > > > + pr_warn("%pfwP: unable to write 0x%x in domaincfg\n",
> > > > > > + priv->fwnode, val);
> > > > > > +}
> > > > > > +
> > > > > > +static void aplic_msi_write_msg(struct msi_desc *desc, struct msi_msg *msg)
> > > > > > +{
> > > > > > + unsigned int group_index, hart_index, guest_index, val;
> > > > > > + struct irq_data *d = irq_get_irq_data(desc->irq);
> > > > > > + struct aplic_priv *priv = irq_data_get_irq_chip_data(d);
> > > > > > + struct aplic_msicfg *mc = &priv->msicfg;
> > > > > > + phys_addr_t tppn, tbppn, msg_addr;
> > > > > > + void __iomem *target;
> > > > > > +
> > > > > > + /* For zeroed MSI, simply write zero into the target register */
> > > > > > + if (!msg->address_hi && !msg->address_lo && !msg->data) {
> > > > > > + target = priv->regs + APLIC_TARGET_BASE;
> > > > > > + target += (d->hwirq - 1) * sizeof(u32);
> > > > > > + writel(0, target);
> > > > > > + return;
> > > > > > + }
> > > > > > +
> > > > > > + /* Sanity check on message data */
> > > > > > + WARN_ON(msg->data > APLIC_TARGET_EIID_MASK);
> > > > > > +
> > > > > > + /* Compute target MSI address */
> > > > > > + msg_addr = (((u64)msg->address_hi) << 32) | msg->address_lo;
> > > > > > + tppn = msg_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > > +
> > > > > > + /* Compute target HART Base PPN */
> > > > > > + tbppn = tppn;
> > > > > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > > > + tbppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > > > + WARN_ON(tbppn != mc->base_ppn);
> > > > > > +
> > > > > > + /* Compute target group and hart indexes */
> > > > > > + group_index = (tppn >> APLIC_xMSICFGADDR_PPN_HHX_SHIFT(mc->hhxs)) &
> > > > > > + APLIC_xMSICFGADDR_PPN_HHX_MASK(mc->hhxw);
> > > > > > + hart_index = (tppn >> APLIC_xMSICFGADDR_PPN_LHX_SHIFT(mc->lhxs)) &
> > > > > > + APLIC_xMSICFGADDR_PPN_LHX_MASK(mc->lhxw);
> > > > > > + hart_index |= (group_index << mc->lhxw);
> > > > > > + WARN_ON(hart_index > APLIC_TARGET_HART_IDX_MASK);
> > > > > > +
> > > > > > + /* Compute target guest index */
> > > > > > + guest_index = tppn & APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > > + WARN_ON(guest_index > APLIC_TARGET_GUEST_IDX_MASK);
> > > > > > +
> > > > > > + /* Update IRQ TARGET register */
> > > > > > + target = priv->regs + APLIC_TARGET_BASE;
> > > > > > + target += (d->hwirq - 1) * sizeof(u32);
> > > > > > + val = (hart_index & APLIC_TARGET_HART_IDX_MASK)
> > > > > > + << APLIC_TARGET_HART_IDX_SHIFT;
> > > > > > + val |= (guest_index & APLIC_TARGET_GUEST_IDX_MASK)
> > > > > > + << APLIC_TARGET_GUEST_IDX_SHIFT;
> > > > > > + val |= (msg->data & APLIC_TARGET_EIID_MASK);
> > > > > > + writel(val, target);
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_setup_msi(struct aplic_priv *priv)
> > > > > > +{
> > > > > > + struct aplic_msicfg *mc = &priv->msicfg;
> > > > > > + const struct imsic_global_config *imsic_global;
> > > > > > +
> > > > > > + /*
> > > > > > + * The APLIC outgoing MSI config registers assume target MSI
> > > > > > + * controller to be RISC-V AIA IMSIC controller.
> > > > > > + */
> > > > > > + imsic_global = imsic_get_global_config();
> > > > > > + if (!imsic_global) {
> > > > > > + pr_err("%pfwP: IMSIC global config not found\n",
> > > > > > + priv->fwnode);
> > > > > > + return -ENODEV;
> > > > > > + }
> > > > > > +
> > > > > > + /* Find number of guest index bits (LHXS) */
> > > > > > + mc->lhxs = imsic_global->guest_index_bits;
> > > > > > + if (APLIC_xMSICFGADDRH_LHXS_MASK < mc->lhxs) {
> > > > > > + pr_err("%pfwP: IMSIC guest index bits big for APLIC LHXS\n",
> > > > > > + priv->fwnode);
> > > > > > + return -EINVAL;
> > > > > > + }
> > > > > > +
> > > > > > + /* Find number of HART index bits (LHXW) */
> > > > > > + mc->lhxw = imsic_global->hart_index_bits;
> > > > > > + if (APLIC_xMSICFGADDRH_LHXW_MASK < mc->lhxw) {
> > > > > > + pr_err("%pfwP: IMSIC hart index bits big for APLIC LHXW\n",
> > > > > > + priv->fwnode);
> > > > > > + return -EINVAL;
> > > > > > + }
> > > > > > +
> > > > > > + /* Find number of group index bits (HHXW) */
> > > > > > + mc->hhxw = imsic_global->group_index_bits;
> > > > > > + if (APLIC_xMSICFGADDRH_HHXW_MASK < mc->hhxw) {
> > > > > > + pr_err("%pfwP: IMSIC group index bits big for APLIC HHXW\n",
> > > > > > + priv->fwnode);
> > > > > > + return -EINVAL;
> > > > > > + }
> > > > > > +
> > > > > > + /* Find first bit position of group index (HHXS) */
> > > > > > + mc->hhxs = imsic_global->group_index_shift;
> > > > > > + if (mc->hhxs < (2 * APLIC_xMSICFGADDR_PPN_SHIFT)) {
> > > > > > + pr_err("%pfwP: IMSIC group index shift should be >= %d\n",
> > > > > > + priv->fwnode, (2 * APLIC_xMSICFGADDR_PPN_SHIFT));
> > > > > > + return -EINVAL;
> > > > > > + }
> > > > > > + mc->hhxs -= (2 * APLIC_xMSICFGADDR_PPN_SHIFT);
> > > > > > + if (APLIC_xMSICFGADDRH_HHXS_MASK < mc->hhxs) {
> > > > > > + pr_err("%pfwP: IMSIC group index shift big for APLIC HHXS\n",
> > > > > > + priv->fwnode);
> > > > > > + return -EINVAL;
> > > > > > + }
> > > > > > +
> > > > > > + /* Compute PPN base */
> > > > > > + mc->base_ppn = imsic_global->base_addr >> APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HART(mc->lhxs);
> > > > > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_LHX(mc->lhxw, mc->lhxs);
> > > > > > + mc->base_ppn &= ~APLIC_xMSICFGADDR_PPN_HHX(mc->hhxw, mc->hhxs);
> > > > > > +
> > > > > > + /* Use all possible CPUs as lmask */
> > > > > > + cpumask_copy(&priv->lmask, cpu_possible_mask);
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > + * To handle an APLIC IDC interrupts, we just read the CLAIMI register
> > > > > > + * which will return highest priority pending interrupt and clear the
> > > > > > + * pending bit of the interrupt. This process is repeated until CLAIMI
> > > > > > + * register return zero value.
> > > > > > + */
> > > > > > +static void aplic_idc_handle_irq(struct irq_desc *desc)
> > > > > > +{
> > > > > > + struct aplic_idc *idc = this_cpu_ptr(&aplic_idcs);
> > > > > > + struct irq_chip *chip = irq_desc_get_chip(desc);
> > > > > > + irq_hw_number_t hw_irq;
> > > > > > + int irq;
> > > > > > +
> > > > > > + chained_irq_enter(chip, desc);
> > > > > > +
> > > > > > + while ((hw_irq = readl(idc->regs + APLIC_IDC_CLAIMI))) {
> > > > > > + hw_irq = hw_irq >> APLIC_IDC_TOPI_ID_SHIFT;
> > > > > > + irq = irq_find_mapping(idc->priv->irqdomain, hw_irq);
> > > > > > +
> > > > > > + if (unlikely(irq <= 0))
> > > > > > + pr_warn_ratelimited("hw_irq %lu mapping not found\n",
> > > > > > + hw_irq);
> > > > > > + else
> > > > > > + generic_handle_irq(irq);
> > > > > > + }
> > > > > > +
> > > > > > + chained_irq_exit(chip, desc);
> > > > > > +}
> > > > > > +
> > > > > > +static void aplic_idc_set_delivery(struct aplic_idc *idc, bool en)
> > > > > > +{
> > > > > > + u32 de = (en) ? APLIC_ENABLE_IDELIVERY : APLIC_DISABLE_IDELIVERY;
> > > > > > + u32 th = (en) ? APLIC_ENABLE_ITHRESHOLD : APLIC_DISABLE_ITHRESHOLD;
> > > > > > +
> > > > > > + /* Priority must be less than threshold for interrupt triggering */
> > > > > > + writel(th, idc->regs + APLIC_IDC_ITHRESHOLD);
> > > > > > +
> > > > > > + /* Delivery must be set to 1 for interrupt triggering */
> > > > > > + writel(de, idc->regs + APLIC_IDC_IDELIVERY);
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_idc_dying_cpu(unsigned int cpu)
> > > > > > +{
> > > > > > + if (aplic_idc_parent_irq)
> > > > > > + disable_percpu_irq(aplic_idc_parent_irq);
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_idc_starting_cpu(unsigned int cpu)
> > > > > > +{
> > > > > > + if (aplic_idc_parent_irq)
> > > > > > + enable_percpu_irq(aplic_idc_parent_irq,
> > > > > > + irq_get_trigger_type(aplic_idc_parent_irq));
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_setup_idc(struct aplic_priv *priv)
> > > > > > +{
> > > > > > + int i, j, rc, cpu, setup_count = 0;
> > > > > > + struct fwnode_reference_args parent;
> > > > > > + struct irq_domain *domain;
> > > > > > + unsigned long hartid;
> > > > > > + struct aplic_idc *idc;
> > > > > > + u32 val;
> > > > > > +
> > > > > > + /* Setup per-CPU IDC and target CPU mask */
> > > > > > + for (i = 0; i < priv->nr_idcs; i++) {
> > > > > > + rc = fwnode_property_get_reference_args(priv->fwnode,
> > > > > > + "interrupts-extended", "#interrupt-cells",
> > > > > > + 0, i, &parent);
> > > > > > + if (rc) {
> > > > > > + pr_warn("%pfwP: parent irq for IDC%d not found\n",
> > > > > > + priv->fwnode, i);
> > > > > > + continue;
> > > > > > + }
> > > > > > +
> > > > > > + /*
> > > > > > + * Skip interrupts other than external interrupts for
> > > > > > + * current privilege level.
> > > > > > + */
> > > > > > + if (parent.args[0] != RV_IRQ_EXT)
> > > > > > + continue;
> > > > > > +
> > > > > > + rc = riscv_fw_parent_hartid(parent.fwnode, &hartid);
> > > > > > + if (rc) {
> > > > > > + pr_warn("%pfwP: invalid hartid for IDC%d\n",
> > > > > > + priv->fwnode, i);
> > > > > > + continue;
> > > > > > + }
> > > > > > +
> > > > > > + cpu = riscv_hartid_to_cpuid(hartid);
> > > > > > + if (cpu < 0) {
> > > > > > + pr_warn("%pfwP: invalid cpuid for IDC%d\n",
> > > > > > + priv->fwnode, i);
> > > > > > + continue;
> > > > > > + }
> > > > > > +
> > > > > > + cpumask_set_cpu(cpu, &priv->lmask);
> > > > > > +
> > > > > > + idc = per_cpu_ptr(&aplic_idcs, cpu);
> > > > > > + idc->hart_index = i;
> > > > > > + idc->regs = priv->regs + APLIC_IDC_BASE + i * APLIC_IDC_SIZE;
> > > > > > + idc->priv = priv;
> > > > > > +
> > > > > > + aplic_idc_set_delivery(idc, true);
> > > > > > +
> > > > > > + /*
> > > > > > + * Boot cpu might not have APLIC hart_index = 0 so check
> > > > > > + * and update target registers of all interrupts.
> > > > > > + */
> > > > > > + if (cpu == smp_processor_id() && idc->hart_index) {
> > > > > > + val = idc->hart_index & APLIC_TARGET_HART_IDX_MASK;
> > > > > > + val <<= APLIC_TARGET_HART_IDX_SHIFT;
> > > > > > + val |= APLIC_DEFAULT_PRIORITY;
> > > > > > + for (j = 1; j <= priv->nr_irqs; j++)
> > > > > > + writel(val, priv->regs + APLIC_TARGET_BASE +
> > > > > > + (j - 1) * sizeof(u32));
> > > > > > + }
> > > > > > +
> > > > > > + setup_count++;
> > > > > > + }
> > > > > > +
> > > > > > + /* Find parent domain and register chained handler */
> > > > > > + domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
> > > > > > + DOMAIN_BUS_ANY);
> > > > > > + if (!aplic_idc_parent_irq && domain) {
> > > > > > + aplic_idc_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
> > > > > > + if (aplic_idc_parent_irq) {
> > > > > > + irq_set_chained_handler(aplic_idc_parent_irq,
> > > > > > + aplic_idc_handle_irq);
> > > > > > +
> > > > > > + /*
> > > > > > + * Setup CPUHP notifier to enable IDC parent
> > > > > > + * interrupt on all CPUs
> > > > > > + */
> > > > > > + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > > > > > + "irqchip/riscv/aplic:starting",
> > > > > > + aplic_idc_starting_cpu,
> > > > > > + aplic_idc_dying_cpu);
> > > > > > + }
> > > > > > + }
> > > > > > +
> > > > > > + /* Fail if we were not able to setup IDC for any CPU */
> > > > > > + return (setup_count) ? 0 : -ENODEV;
> > > > > > +}
> > > > > > +
> > > > > > +static int aplic_probe(struct platform_device *pdev)
> > > > > > +{
> > > > > > + struct fwnode_handle *fwnode = pdev->dev.fwnode;
> > > > > > + struct fwnode_reference_args parent;
> > > > > > + struct aplic_priv *priv;
> > > > > > + struct resource *res;
> > > > > > + phys_addr_t pa;
> > > > > > + int rc;
> > > > > > +
> > > > > > + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > > > > > + if (!priv)
> > > > > > + return -ENOMEM;
> > > > > > + priv->fwnode = fwnode;
> > > > > > +
> > > > > > + /* Map the MMIO registers */
> > > > > > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > > > > + if (!res) {
> > > > > > + pr_err("%pfwP: failed to get MMIO resource\n", fwnode);
> > > > > > + return -EINVAL;
> > > > > > + }
> > > > > > + priv->regs = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> > > > > > + if (!priv->regs) {
> > > > > > > + pr_err("%pfwP: failed to map MMIO registers\n", fwnode);
> > > > > > + return -ENOMEM;
> > > > > > + }
> > > > > > +
> > > > > > + /*
> > > > > > + * Find out GSI base number
> > > > > > + *
> > > > > > + * Note: DT does not define "riscv,gsi-base" property so GSI
> > > > > > + * base is always zero for DT.
> > > > > > + */
> > > > > > + rc = fwnode_property_read_u32_array(fwnode, "riscv,gsi-base",
> > > > > > + &priv->gsi_base, 1);
> > > > > > + if (rc)
> > > > > > + priv->gsi_base = 0;
> > > > > > +
> > > > > > + /* Find out number of interrupt sources */
> > > > > > + rc = fwnode_property_read_u32_array(fwnode, "riscv,num-sources",
> > > > > > + &priv->nr_irqs, 1);
> > > > > > + if (rc) {
> > > > > > + pr_err("%pfwP: failed to get number of interrupt sources\n",
> > > > > > + fwnode);
> > > > > > + return rc;
> > > > > > + }
> > > > > > +
> > > > > > + /* Setup initial state APLIC interrupts */
> > > > > > + aplic_init_hw_irqs(priv);
> > > > > > +
> > > > > > + /*
> > > > > > + * Find out number of IDCs based on parent interrupts
> > > > > > + *
> > > > > > + * If "msi-parent" property is present then we ignore the
> > > > > > + * APLIC IDCs which forces the APLIC driver to use MSI mode.
> > > > > > + */
> > > > > > + if (!fwnode_property_present(fwnode, "msi-parent")) {
> > > > > > + while (!fwnode_property_get_reference_args(fwnode,
> > > > > > + "interrupts-extended", "#interrupt-cells",
> > > > > > + 0, priv->nr_idcs, &parent))
> > > > > > + priv->nr_idcs++;
> > > > > > + }
> > > > > > +
> > > > > > + /* Setup IDCs or MSIs based on number of IDCs */
> > > > > > + if (priv->nr_idcs)
> > > > > > + rc = aplic_setup_idc(priv);
> > > > > > + else
> > > > > > + rc = aplic_setup_msi(priv);
> > > > > > + if (rc) {
> > > > > > > + pr_err("%pfwP: failed to setup %s\n",
> > > > > > + fwnode, priv->nr_idcs ? "IDCs" : "MSIs");
> > > > > > + return rc;
> > > > > > + }
> > > > > > +
> > > > > > + /* Setup global config and interrupt delivery */
> > > > > > + aplic_init_hw_global(priv);
> > > > > > +
> > > > > > + /* Create irq domain instance for the APLIC */
> > > > > > + if (priv->nr_idcs)
> > > > > > + priv->irqdomain = irq_domain_create_linear(
> > > > > > + priv->fwnode,
> > > > > > + priv->nr_irqs + 1,
> > > > > > + &aplic_irqdomain_idc_ops,
> > > > > > + priv);
> > > > > > + else
> > > > > > + priv->irqdomain = platform_msi_create_device_domain(
> > > > > > + &pdev->dev,
> > > > > > + priv->nr_irqs + 1,
> > > > > > + aplic_msi_write_msg,
> > > > > > + &aplic_irqdomain_msi_ops,
> > > > > > + priv);
> > > > > > + if (!priv->irqdomain) {
> > > > > > + pr_err("%pfwP: failed to add irq domain\n", priv->fwnode);
> > > > > > + return -ENOMEM;
> > > > > > + }
> > > > > > +
> > > > > > + /* Advertise the interrupt controller */
> > > > > > + if (priv->nr_idcs) {
> > > > > > + pr_info("%pfwP: %d interrupts directly connected to %d CPUs\n",
> > > > > > + priv->fwnode, priv->nr_irqs, priv->nr_idcs);
> > > > > > + } else {
> > > > > > + pa = priv->msicfg.base_ppn << APLIC_xMSICFGADDR_PPN_SHIFT;
> > > > > > > + pr_info("%pfwP: %d interrupts forwarded to MSI base %pa\n",
> > > > > > + priv->fwnode, priv->nr_irqs, &pa);
> > > > > > + }
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static const struct of_device_id aplic_match[] = {
> > > > > > + { .compatible = "riscv,aplic" },
> > > > > > + {}
> > > > > > +};
> > > > > > +
> > > > > > +static struct platform_driver aplic_driver = {
> > > > > > + .driver = {
> > > > > > + .name = "riscv-aplic",
> > > > > > + .of_match_table = aplic_match,
> > > > > > + },
> > > > > > + .probe = aplic_probe,
> > > > > > +};
> > > > > > +builtin_platform_driver(aplic_driver);
> > > > > > +
> > > > > > +static int __init aplic_dt_init(struct device_node *node,
> > > > > > + struct device_node *parent)
> > > > > > +{
> > > > > > + /*
> > > > > > + * The APLIC platform driver needs to be probed early
> > > > > > + * so for device tree:
> > > > > > + *
> > > > > > + * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
> > > > > > + * provides a hint to the device driver core to probe the
> > > > > > + * platform driver early.
> > > > > > + * 2) Clear the OF_POPULATED flag in device_node because
> > > > > > + * of_irq_init() sets it which prevents creation of
> > > > > > + * platform device.
> > > > > > + */
> > > > > > + node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
> > > > >
> > > > > NACK. You are blindly plastering flags without trying to understand
> > > > > the real issue and fixing this correctly.
> > > > >
> > > > > > + of_node_clear_flag(node, OF_POPULATED);
> > > > > > + return 0;
> > > > > > +}
> > > > > > +IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
> > > > >
> > > > > This macro pretty much skips the entire driver core framework to probe
> > > > > and calls init and you are supposed to initialize the device when the
> > > > > init function is called.
> > > > >
> > > > > If you want your device/driver to follow the proper platform driver
> > > > > path (which is recommended), then you need to use the
> > > > > IRQCHIP_PLATFORM_DRIVER_BEGIN() and related macros. Grep for plenty of examples.
> > > > >
> > > > > I offered to help you debug this issue and I asked for a dts file that
> > > > > corresponds to a board you are testing this on and seeing an issue.
> > > > > But you haven't answered my question [1] and are pointing to some
> > > > > random commit and blaming it. That commit has no impact on any
> > > > > existing devices/drivers.
> > > > >
> > > > > Hi Marc,
> > > > >
> > > > > Please consider this patch Nacked as long as FWNODE_FLAG_BEST_EFFORT
> > > > > is used or until Anup actually works with us to debug the real issue.
> > > >
> > > > Maybe I misread your previous comment.
> > > >
> > > > You can easily reproduce the issue on QEMU virt machine for RISC-V:
> > > > 1) Build qemu-system-riscv64 from latest QEMU master
> > > > 2) Build kernel from riscv_aia_v4 branch at https://github.com/avpatel/linux.git
> > > > (Note: make sure you remove the FWNODE_FLAG_BEST_EFFORT flag from
> > > > APLIC driver at the time of building kernel)
> > > > 3) Boot an APLIC-only system on QEMU virt machine
> > > > qemu-system-riscv64 -smp 4 -M virt,aia=aplic -m 1G -nographic \
> > > > -bios opensbi/build/platform/generic/firmware/fw_dynamic.bin \
> > > > -kernel ./build-riscv64/arch/riscv/boot/Image \
> > > > -append "root=/dev/ram rw console=ttyS0 earlycon" \
> > > > -initrd ./rootfs_riscv64.img
> > >
> > > Unfortunately, I don't have the time to do all that, but I generally
> > > don't need to run something to figure out the issue. It's generally
> > > fairly obvious once I look at the DT. I'll also lean on you for some
> > > debug logs.
> >
> > The boot log with FWNODE_BEST_EFFORT flag in APLIC can be
> > found at:
> > https://drive.google.com/file/d/1C-uuHbh6Zk9xkAsfGLfhb_4WighvmQp1/view?usp=sharing
> >
> > The boot log without FWNODE_BEST_EFFORT flag in APLIC can
> > be found at:
> > https://drive.google.com/file/d/12SRdR-2Frv_5O06kbuI_LUJ88khjf_7O/view?usp=sharing
> >
> > >
> > > Where is the dts file that corresponds to this QEMU run? This is the
> > > third time I'm asking for a pointer to a dts file that has this issue,
> > > can you point me to it please? I shouldn't have to say this but: put
> > > it somewhere and point me to it please. Please don't point me to some
> > > git repo and ask me to dig around.
> >
> > For QEMU virt machine, the DTB is generated at runtime as part of
> > virt machine creation. The DTS dumped by QEMU using the "dumpdtb"
> > command line option can be found at:
> > https://drive.google.com/file/d/1EU-exItL1B7EWuoXw4q-Ypocq--5Wvn8/view?usp=sharing
> >
> > >
> > > Can you give me details on what supplier is causing the deferred probe
> > > that's a problem for you? Any other details you can provide that'll
> > > help debug this issue?
> >
> > FWNODE supplier for APLIC DT node is the OF framework.
> >
> > >
> > > > I hope the above steps help you reproduce the issue. I will certainly
> > > > test whatever fix you propose.
> > >
> > > Do you plan to try the fix I suggested already? The one about using
> > > the correct macros?
> >
> > You mean use IRQCHIP_DECLARE() in the APLIC driver ?
> > or something else ?
>
> No. My previous email asking you to NOT use IRQCHIP_DECLARE() and
> instead use IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros.

I tried IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros but these
macros are not suitable for APLIC driver because we need platform device
pointer in the APLIC probe() to create platform MSI device domain (refer,
platform_msi_create_device_domain()).

Further, I tried setting the "suppress_bind_attrs" flag in "struct
platform_driver aplic_driver" just like the
IRQCHIP_PLATFORM_DRIVER_END() macro but this did not work.

Regards,
Anup

2023-06-23 13:07:39

by Marc Zyngier

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

[here, let me trim all of this nonsense...]

On Fri, 23 Jun 2023 12:47:00 +0100,
Anup Patel <[email protected]> wrote:
> > No. My previous email asking you to NOT use IRQCHIP_DECLARE() and
> > instead use IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros.
>
> I tried IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros but these
> macros are not suitable for APLIC driver because we need platform device
> pointer in the APLIC probe() to create platform MSI device domain (refer,
> platform_msi_create_device_domain()).

Oh come on. How hard have you tried? Have you even looked at the other
drivers in the tree to see how they solve this insurmountable problem
with a *single* line of code?

pdev = of_find_device_by_node(node);

That's it.

> Further, I tried setting the "suppress_bind_attrs" flag in "struct
> platform_driver aplic_driver" just like the
> IRQCHIP_PLATFORM_DRIVER_END() macro but this did not work.

I'm not sure how relevant this is to the conversation.

M.

--
Without deviation from the norm, progress is not possible.

2023-06-23 14:19:31

by Anup Patel

Subject: Re: [PATCH v4 08/10] irqchip: Add RISC-V advanced PLIC driver

On Fri, Jun 23, 2023 at 6:19 PM Marc Zyngier <[email protected]> wrote:
>
> [here, let me trim all of this nonsense...]
>
> On Fri, 23 Jun 2023 12:47:00 +0100,
> Anup Patel <[email protected]> wrote:
> > > No. My previous email asking you to NOT use IRQCHIP_DECLARE() and
> > > instead use IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros.
> >
> > I tried IRQCHIP_PLATFORM_DRIVER_BEGIN/END() macros but these
> > macros are not suitable for APLIC driver because we need platform device
> > pointer in the APLIC probe() to create platform MSI device domain (refer,
> > platform_msi_create_device_domain()).
>
> Oh come on. How hard have you tried? Have you even looked at the other
> drivers in the tree to see how they solve this insurmountable problem
> with a *single* line of code?
>
> pdev = of_find_device_by_node(node);
>
> That's it.

Please see the diff below. I tried the same thing, but the APLIC still
does not get probed without the FWNODE_FLAG_BEST_EFFORT flag. Also
note that the current APLIC driver works unmodified for both DT and ACPI,
whereas using of_find_device_by_node() here breaks ACPI support.

diff --git a/drivers/irqchip/irq-riscv-aplic.c b/drivers/irqchip/irq-riscv-aplic.c
index 1e710fdf5608..9ae9e7fb905f 100644
--- a/drivers/irqchip/irq-riscv-aplic.c
+++ b/drivers/irqchip/irq-riscv-aplic.c
@@ -17,6 +17,7 @@
#include <linux/irqdomain.h>
#include <linux/module.h>
#include <linux/msi.h>
+#include <linux/of_platform.h>
#include <linux/platform_device.h>
#include <linux/smp.h>

@@ -730,36 +731,12 @@ static int aplic_probe(struct platform_device *pdev)
return 0;
}

-static const struct of_device_id aplic_match[] = {
- { .compatible = "riscv,aplic" },
- {}
-};
-
-static struct platform_driver aplic_driver = {
- .driver = {
- .name = "riscv-aplic",
- .of_match_table = aplic_match,
- },
- .probe = aplic_probe,
-};
-builtin_platform_driver(aplic_driver);
-
-static int __init aplic_dt_init(struct device_node *node,
+static int __init aplic_of_init(struct device_node *dn,
struct device_node *parent)
{
- /*
- * The APLIC platform driver needs to be probed early
- * so for device tree:
- *
- * 1) Set the FWNODE_FLAG_BEST_EFFORT flag in fwnode which
- * provides a hint to the device driver core to probe the
- * platform driver early.
- * 2) Clear the OF_POPULATED flag in device_node because
- * of_irq_init() sets it which prevents creation of
- * platform device.
- */
- node->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT;
- of_node_clear_flag(node, OF_POPULATED);
- return 0;
+ return aplic_probe(of_find_device_by_node(dn));
}
-IRQCHIP_DECLARE(riscv_aplic, "riscv,aplic", aplic_dt_init);
+
+IRQCHIP_PLATFORM_DRIVER_BEGIN(aplic)
+IRQCHIP_MATCH("riscv,aplic", aplic_of_init)
+IRQCHIP_PLATFORM_DRIVER_END(aplic)

>
> > Further, I tried setting the "suppress_bind_attrs" flag in "struct
> > platform_driver aplic_driver" just like the
> > IRQCHIP_PLATFORM_DRIVER_END() macro but this did not work.
>
> I'm not sure how relevant this is to the conversation.

It's relevant because the only difference between the platform_driver
registered by IRQCHIP_PLATFORM_DRIVER_END() and
"struct platform_driver aplic_driver" is the "suppress_bind_attrs" flag.

Unfortunately, setting the "suppress_bind_attrs" flag does not
help either.

>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

Regards,
Anup