2014-11-24 14:35:41

by Marc Zyngier

Subject: [PATCH v3 00/13] arm64: PCI/MSI: GICv3 ITS support (stacked domain edition)

The GICv3 architecture provides a way to implement support for
MSI/MSI-X using a specific block called the ITS (Interrupt Translation
Service).

The ITS can be accurately described as "page tables for
interrupts". If you think this sounds scary, you're spot on. It uses a
set of opaque memory tables that are manipulated through commands
(software almost never touches the tables directly). In order to make
it slightly easier to digest, the code has been split into (mostly)
logical units.
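
To make the "page tables for interrupts" analogy a bit more concrete, here is a minimal toy model of the lookup, assuming the three-level device table -> ITT -> collection layout described later in this series. None of the toy_* names exist in the driver; they are hypothetical and only illustrate the idea. The real tables are opaque to software and are only manipulated through ITS commands.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model, NOT the driver's (or the hardware's) layout. */
struct toy_itt_entry {
	int      valid;
	uint32_t lpi;        /* physical interrupt ID (LPIs start at 8192) */
	uint16_t collection; /* index into the collection table */
};

struct toy_device {
	int                   valid;
	size_t                nr_events;
	struct toy_itt_entry *itt; /* per-device Interrupt Translation Table */
};

struct toy_its {
	struct toy_device *devices;     /* indexed by DeviceID */
	size_t             nr_devices;
	const int         *collections; /* collection ID -> target CPU */
	size_t             nr_collections;
};

/*
 * Walk device table -> ITT -> collection, much like a page-table walk
 * goes through successive levels. Returns 0 and fills *lpi and *cpu on
 * success, or -1 if any level has no valid mapping.
 */
static int toy_its_translate(const struct toy_its *its, uint32_t dev_id,
			     uint32_t event_id, uint32_t *lpi, int *cpu)
{
	const struct toy_device *dev;
	const struct toy_itt_entry *ite;

	if (dev_id >= its->nr_devices)
		return -1;

	dev = &its->devices[dev_id];
	if (!dev->valid || event_id >= dev->nr_events)
		return -1;

	ite = &dev->itt[event_id];
	if (!ite->valid || ite->collection >= its->nr_collections)
		return -1;

	*lpi = ite->lpi;
	*cpu = its->collections[ite->collection];
	return 0;
}
```

In this picture, a device write to the ITS doorbell supplies (DeviceID, EventID), and the walk yields the LPI number and the CPU it should be delivered to.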

To make things more fun, this relies on Jiang Liu's stacked domain
patch series as now merged in tip/irq/irqdomain:

- patch 1 imports the new asm-generic/msi.h file into arch/arm64
- patches 2 to 13 are the bulk of the ITS driver.

This has been tested on arm64 with an FVP model, and is based on
tip/irq/irqdomain. The whole thing is available at:

git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git irq/gicv3-its

Unless someone screams murder, I consider this to be ready for merge.

M.

From v2 [2]:
- rebased on top of the stable version of tip/irq/irqdomain
- use irq_domain_reset_irq_data instead of
irq_domain_set_hwirq_and_chip on the free path
- use pci_msi_mask_irq instead of mask_msi_irq
- use host_data to pass the ITS structure around
- top-level MSI domain is now identified by the ITS of_node

From v1 [1]:
- rebased on top of tip/irq/irqdomain
- dropped the arm64-specific implementation of arch_setup_msi_irqs and co.
- reworked the whole ITS/MSI setup to use the new MSI/PCI split

[1]: http://lwn.net/Articles/619788/
[2]: https://lkml.org/lkml/2014/11/18/825

Marc Zyngier (13):
arm64: PCI/MSI: Use asm-generic/msi.h
irqchip: GICv3: Convert to domain hierarchy
irqchip: GICv3: rework redistributor structure
irqchip: GICv3: ITS command queue
irqchip: GICv3: ITS: irqchip implementation
irqchip: GICv3: ITS: LPI allocator
irqchip: GICv3: ITS: tables allocators
irqchip: GICv3: ITS: device allocation and configuration
irqchip: GICv3: ITS: MSI support
irqchip: GICv3: ITS: DT probing and initialization
irqchip: GICv3: ITS: plug ITS init into main GICv3 code
irqchip: GICv3: ITS: enable compilation of the ITS driver
irqchip: GICv3: Binding updates for ITS

Documentation/devicetree/bindings/arm/gic-v3.txt | 39 +
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/Kbuild | 1 +
drivers/irqchip/Kconfig | 5 +
drivers/irqchip/Makefile | 1 +
drivers/irqchip/irq-gic-v3-its.c | 1402 ++++++++++++++++++++++
drivers/irqchip/irq-gic-v3.c | 156 ++-
include/linux/irqchip/arm-gic-v3.h | 128 ++
8 files changed, 1693 insertions(+), 40 deletions(-)
create mode 100644 drivers/irqchip/irq-gic-v3-its.c

--
2.1.3


2014-11-24 14:35:51

by Marc Zyngier

Subject: [PATCH v3 09/13] irqchip: GICv3: ITS: MSI support

Now, the bit of code that allows us to use the ITS as an MSI controller.
Both MSI and MSI-X are supported.
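
As a rough illustration of what the message composition in this patch boils down to, here is a standalone sketch. It assumes the doorbell address (the ITS physical base plus the GITS_TRANSLATER offset) has already been computed by the caller; struct toy_msi_msg and toy_compose_msi_msg() are hypothetical stand-ins, not the kernel's struct msi_msg or its_irq_compose_msi_msg().

```c
#include <stdint.h>

/* Hypothetical stand-in for the three fields the patch fills in. */
struct toy_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/*
 * Split the 64-bit doorbell address into two 32-bit halves and use the
 * per-device event ID (hwirq relative to the device's first LPI) as the
 * MSI payload.
 */
static struct toy_msi_msg toy_compose_msi_msg(uint64_t doorbell,
					      uint32_t hwirq,
					      uint32_t lpi_base)
{
	struct toy_msi_msg msg;

	msg.address_lo = (uint32_t)(doorbell & 0xffffffffu);
	msg.address_hi = (uint32_t)(doorbell >> 32);
	msg.data = hwirq - lpi_base;

	return msg;
}
```

The payload is the event ID rather than the LPI number itself: the ITS, not the device, performs the translation to a physical interrupt.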

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 176 +++++++++++++++++++++++++++++++++++++
include/linux/irqchip/arm-gic-v3.h | 6 ++
2 files changed, 182 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index d687fd4..532c6df 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -587,12 +587,47 @@ static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
return IRQ_SET_MASK_OK_DONE;
}

+static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
+{
+ struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+ struct its_node *its;
+ u64 addr;
+
+ its = its_dev->its;
+ addr = its->phys_base + GITS_TRANSLATER;
+
+ msg->address_lo = addr & ((1UL << 32) - 1);
+ msg->address_hi = addr >> 32;
+ msg->data = its_get_event_id(d);
+}
+
static struct irq_chip its_irq_chip = {
.name = "ITS",
.irq_mask = its_mask_irq,
.irq_unmask = its_unmask_irq,
.irq_eoi = its_eoi_irq,
.irq_set_affinity = its_set_affinity,
+ .irq_compose_msi_msg = its_irq_compose_msi_msg,
+};
+
+static void its_mask_msi_irq(struct irq_data *d)
+{
+ pci_msi_mask_irq(d);
+ irq_chip_mask_parent(d);
+}
+
+static void its_unmask_msi_irq(struct irq_data *d)
+{
+ pci_msi_unmask_irq(d);
+ irq_chip_unmask_parent(d);
+}
+
+static struct irq_chip its_msi_irq_chip = {
+ .name = "ITS-MSI",
+ .irq_unmask = its_unmask_msi_irq,
+ .irq_mask = its_mask_msi_irq,
+ .irq_eoi = irq_chip_eoi_parent,
+ .irq_write_msi_msg = pci_msi_domain_write_msg,
};

/*
@@ -1055,3 +1090,144 @@ static void its_free_device(struct its_device *its_dev)
kfree(its_dev->itt);
kfree(its_dev);
}
+
+static int its_alloc_device_irq(struct its_device *dev, irq_hw_number_t *hwirq)
+{
+ int idx;
+
+ idx = find_first_zero_bit(dev->lpi_map, dev->nr_lpis);
+ if (idx == dev->nr_lpis)
+ return -ENOSPC;
+
+ *hwirq = dev->lpi_base + idx;
+ set_bit(idx, dev->lpi_map);
+
+ /* Map the GIC irq ID to the device */
+ its_send_mapvi(dev, *hwirq, idx);
+
+ return 0;
+}
+
+static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
+ int nvec, msi_alloc_info_t *info)
+{
+ struct pci_dev *pdev;
+ struct its_node *its;
+ u32 dev_id;
+ struct its_device *its_dev;
+
+ if (!dev_is_pci(dev))
+ return -EINVAL;
+
+ pdev = to_pci_dev(dev);
+ dev_id = PCI_DEVID(pdev->bus->number, pdev->devfn);
+ its = domain->parent->host_data;
+
+ its_dev = its_find_device(its, dev_id);
+ if (WARN_ON(its_dev))
+ return -EINVAL;
+
+ its_dev = its_create_device(its, dev_id, nvec);
+ if (!its_dev)
+ return -ENOMEM;
+
+ dev_dbg(&pdev->dev, "ITT %d entries, %d bits\n", nvec, ilog2(nvec));
+
+ info->scratchpad[0].ptr = its_dev;
+ info->scratchpad[1].ptr = dev;
+ return 0;
+}
+
+static struct msi_domain_ops its_pci_msi_ops = {
+ .msi_prepare = its_msi_prepare,
+};
+
+static struct msi_domain_info its_pci_msi_domain_info = {
+ .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+ MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX),
+ .ops = &its_pci_msi_ops,
+ .chip = &its_msi_irq_chip,
+};
+
+static int its_irq_gic_domain_alloc(struct irq_domain *domain,
+ unsigned int virq,
+ irq_hw_number_t hwirq)
+{
+ struct of_phandle_args args;
+
+ args.np = domain->parent->of_node;
+ args.args_count = 3;
+ args.args[0] = GIC_IRQ_TYPE_LPI;
+ args.args[1] = hwirq;
+ args.args[2] = IRQ_TYPE_EDGE_RISING;
+
+ return irq_domain_alloc_irqs_parent(domain, virq, 1, &args);
+}
+
+static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+ unsigned int nr_irqs, void *args)
+{
+ msi_alloc_info_t *info = args;
+ struct its_device *its_dev = info->scratchpad[0].ptr;
+ irq_hw_number_t hwirq;
+ int err;
+ int i;
+
+ for (i = 0; i < nr_irqs; i++) {
+ err = its_alloc_device_irq(its_dev, &hwirq);
+ if (err)
+ return err;
+
+ err = its_irq_gic_domain_alloc(domain, virq + i, hwirq);
+ if (err)
+ return err;
+
+ irq_domain_set_hwirq_and_chip(domain, virq + i,
+ hwirq, &its_irq_chip, its_dev);
+ dev_dbg(info->scratchpad[1].ptr, "ID:%d pID:%d vID:%d\n",
+ (int)(hwirq - its_dev->lpi_base), (int)hwirq, virq + i);
+ }
+
+ return 0;
+}
+
+static void its_irq_domain_free(struct irq_domain *domain, unsigned int virq,
+ unsigned int nr_irqs)
+{
+ struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+ struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+ int i;
+
+ for (i = 0; i < nr_irqs; i++) {
+ struct irq_data *data = irq_domain_get_irq_data(domain,
+ virq + i);
+ int event = its_get_event_id(data);
+
+ /* Stop the delivery of interrupts */
+ its_send_discard(its_dev, event);
+
+ /* Mark interrupt index as unused */
+ clear_bit(event, its_dev->lpi_map);
+
+ /* Nuke the entry in the domain */
+ irq_domain_reset_irq_data(data);
+ }
+
+ /* If all interrupts have been freed, start mopping the floor */
+ if (bitmap_empty(its_dev->lpi_map, its_dev->nr_lpis)) {
+ its_lpi_free(its_dev->lpi_map,
+ its_dev->lpi_base,
+ its_dev->nr_lpis);
+
+ /* Unmap device/itt */
+ its_send_mapd(its_dev, 0);
+ its_free_device(its_dev);
+ }
+
+ irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static const struct irq_domain_ops its_domain_ops = {
+ .alloc = its_irq_domain_alloc,
+ .free = its_irq_domain_free,
+};
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 21c9d70..0ed30d7 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -295,6 +295,12 @@

#include <linux/stringify.h>

+/*
+ * We need a value to serve as a irq-type for LPIs. Choose one that will
+ * hopefully pique the interest of the reviewer.
+ */
+#define GIC_IRQ_TYPE_LPI 0xa110c8ed
+
struct rdists {
struct {
void __iomem *rd_base;
--
2.1.3

2014-11-24 14:35:57

by Marc Zyngier

Subject: [PATCH v3 11/13] irqchip: GICv3: ITS: plug ITS init into main GICv3 code

As the ITS is always a subsystem of GICv3, its probing/init is
driven by the main GICv3 code.

Plug that code in (guarded by a config option).
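
For reference, the hwirq ranges that the reworked gic_irq_domain_map() distinguishes can be summarised with the sketch below. It is only a model of the checks in this patch, assuming irq_nr (the number of SGI/PPI/SPI IDs implemented) is at most 8192 and id_bits is at most 32; the toy_* names are hypothetical.

```c
#include <stdint.h>

enum toy_irq_class {
	TOY_SGI,     /* 0-15: private to the core kernel */
	TOY_PPI,     /* 16-31 */
	TOY_SPI,     /* 32 .. irq_nr - 1 */
	TOY_LPI,     /* 8192 .. (1 << id_bits) - 1 */
	TOY_INVALID, /* hole below 8192, or beyond the ID space */
};

static enum toy_irq_class toy_classify_hwirq(uint32_t hw, uint32_t irq_nr,
					     unsigned int id_bits)
{
	uint64_t id_nr = 1ULL << id_bits; /* mirrors GIC_ID_NR */

	if (hw < 16)
		return TOY_SGI;
	if (hw < 32)
		return TOY_PPI;
	if (hw < irq_nr)
		return TOY_SPI;
	if (hw < 8192)
		return TOY_INVALID; /* "Nothing here" */
	if (hw < id_nr)
		return TOY_LPI;

	return TOY_INVALID;         /* "Off limits" */
}
```

Note that the real code additionally refuses LPIs when the distributor does not advertise support for them, which this sketch leaves out.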

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 41 ++++++++++++++++++++++++++++++++------
include/linux/irqchip/arm-gic-v3.h | 5 +++++
2 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 43e57da..1a146cc 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -76,9 +76,6 @@ static inline void __iomem *gic_dist_base(struct irq_data *d)
if (d->hwirq <= 1023) /* SPI -> dist_base */
return gic_data.dist_base;

- if (d->hwirq >= 8192)
- BUG(); /* LPI Detected!!! */
-
return NULL;
}

@@ -276,11 +273,11 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
do {
irqnr = gic_read_iar();

- if (likely(irqnr > 15 && irqnr < 1020)) {
+ if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
int err;
err = handle_domain_irq(gic_data.domain, irqnr, regs);
if (err) {
- WARN_ONCE(true, "Unexpected SPI received!\n");
+ WARN_ONCE(true, "Unexpected interrupt received!\n");
gic_write_eoir(irqnr);
}
continue;
@@ -393,6 +390,11 @@ static void gic_cpu_sys_reg_init(void)
gic_write_grpen1(1);
}

+static int gic_dist_supports_lpis(void)
+{
+ return !!(readl_relaxed(gic_data.dist_base + GICD_TYPER) & GICD_TYPER_LPIS);
+}
+
static void gic_cpu_init(void)
{
void __iomem *rbase;
@@ -407,6 +409,10 @@ static void gic_cpu_init(void)

gic_cpu_config(rbase, gic_redist_wait_for_rwp);

+ /* Give LPIs a spin */
+ if (IS_ENABLED(CONFIG_ARM_GIC_V3_ITS) && gic_dist_supports_lpis())
+ its_cpu_init();
+
/* initialise system registers */
gic_cpu_sys_reg_init();
}
@@ -593,12 +599,21 @@ static struct irq_chip gic_chip = {
.irq_set_affinity = gic_set_affinity,
};

+#define GIC_ID_NR (1U << gic_data.rdists.id_bits)
+
static int gic_irq_domain_map(struct irq_domain *d, unsigned int irq,
irq_hw_number_t hw)
{
/* SGIs are private to the core kernel */
if (hw < 16)
return -EPERM;
+ /* Nothing here */
+ if (hw >= gic_data.irq_nr && hw < 8192)
+ return -EPERM;
+ /* Off limits */
+ if (hw >= GIC_ID_NR)
+ return -EPERM;
+
/* PPIs */
if (hw < 32) {
irq_set_percpu_devid(irq);
@@ -612,7 +627,15 @@ static int gic_irq_domain_map(struct irq_domain *d, unsigned int irq,
handle_fasteoi_irq, NULL, NULL);
set_irq_flags(irq, IRQF_VALID | IRQF_PROBE);
}
- irq_set_chip_data(irq, d->host_data);
+ /* LPIs */
+ if (hw >= 8192 && hw < GIC_ID_NR) {
+ if (!gic_dist_supports_lpis())
+ return -EPERM;
+ irq_domain_set_info(d, irq, hw, &gic_chip, d->host_data,
+ handle_fasteoi_irq, NULL, NULL);
+ set_irq_flags(irq, IRQF_VALID);
+ }
+
return 0;
}

@@ -633,6 +656,9 @@ static int gic_irq_domain_xlate(struct irq_domain *d,
case 1: /* PPI */
*out_hwirq = intspec[1] + 16;
break;
+ case GIC_IRQ_TYPE_LPI: /* LPI */
+ *out_hwirq = intspec[1];
+ break;
default:
return -EINVAL;
}
@@ -759,6 +785,9 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare

set_handle_irq(gic_handle_irq);

+ if (IS_ENABLED(CONFIG_ARM_GIC_V3_ITS) && gic_dist_supports_lpis())
+ its_init(node, &gic_data.rdists, gic_data.domain);
+
gic_smp_init();
gic_dist_init();
gic_cpu_init();
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 0ed30d7..1e8b0cf 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -318,6 +318,11 @@ static inline void gic_write_eoir(u64 irq)
isb();
}

+struct irq_domain;
+int its_cpu_init(void);
+int its_init(struct device_node *node, struct rdists *rdists,
+ struct irq_domain *domain);
+
#endif

#endif
--
2.1.3

2014-11-24 14:36:03

by Marc Zyngier

Subject: [PATCH v3 10/13] irqchip: GICv3: ITS: DT probing and initialization

Add the code that probes the ITS from the device tree,
and initializes it.
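
Two of the register fields handled during probing are simple enough to show in isolation. The sketch below mirrors the arithmetic in its_probe() for the ITE size (read from GITS_TYPER) and for the size field written to GITS_CBASER; the helper names are hypothetical, and the queue size is passed in as a parameter since its actual value is defined elsewhere in the series.

```c
#include <stdint.h>

/* Bytes per interrupt translation entry: bits [7:4] hold (size - 1). */
static uint32_t toy_ite_size(uint32_t gits_typer)
{
	return ((gits_typer >> 4) & 0xf) + 1;
}

/*
 * Size field of the command queue base register: the number of 4kB
 * pages in the queue, minus one. queue_bytes is assumed to be a
 * non-zero multiple of 4096.
 */
static uint64_t toy_cbaser_size_field(uint64_t queue_bytes)
{
	return queue_bytes / 4096 - 1;
}
```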

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 169 +++++++++++++++++++++++++++++++++++++++
1 file changed, 169 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 532c6df..e9d1615 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1231,3 +1231,172 @@ static const struct irq_domain_ops its_domain_ops = {
.alloc = its_irq_domain_alloc,
.free = its_irq_domain_free,
};
+
+static int its_probe(struct device_node *node, struct irq_domain *parent)
+{
+ struct resource res;
+ struct its_node *its;
+ void __iomem *its_base;
+ u32 val;
+ u64 baser, tmp;
+ int err;
+
+ err = of_address_to_resource(node, 0, &res);
+ if (err) {
+ pr_warn("%s: no regs?\n", node->full_name);
+ return -ENXIO;
+ }
+
+ its_base = ioremap(res.start, resource_size(&res));
+ if (!its_base) {
+ pr_warn("%s: unable to map registers\n", node->full_name);
+ return -ENOMEM;
+ }
+
+ val = readl_relaxed(its_base + GITS_PIDR2) & GIC_PIDR2_ARCH_MASK;
+ if (val != 0x30 && val != 0x40) {
+ pr_warn("%s: no ITS detected, giving up\n", node->full_name);
+ err = -ENODEV;
+ goto out_unmap;
+ }
+
+ pr_info("ITS: %s\n", node->full_name);
+
+ its = kzalloc(sizeof(*its), GFP_KERNEL);
+ if (!its) {
+ err = -ENOMEM;
+ goto out_unmap;
+ }
+
+ raw_spin_lock_init(&its->lock);
+ INIT_LIST_HEAD(&its->entry);
+ INIT_LIST_HEAD(&its->its_device_list);
+ its->base = its_base;
+ its->phys_base = res.start;
+ its->msi_chip.of_node = node;
+ its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
+
+ its->cmd_base = kzalloc(ITS_CMD_QUEUE_SZ, GFP_KERNEL);
+ if (!its->cmd_base) {
+ err = -ENOMEM;
+ goto out_free_its;
+ }
+ its->cmd_write = its->cmd_base;
+
+ err = its_alloc_tables(its);
+ if (err)
+ goto out_free_cmd;
+
+ err = its_alloc_collections(its);
+ if (err)
+ goto out_free_tables;
+
+ baser = (virt_to_phys(its->cmd_base) |
+ GITS_CBASER_WaWb |
+ GITS_CBASER_InnerShareable |
+ (ITS_CMD_QUEUE_SZ / SZ_4K - 1) |
+ GITS_CBASER_VALID);
+
+ writeq_relaxed(baser, its->base + GITS_CBASER);
+ tmp = readq_relaxed(its->base + GITS_CBASER);
+ writeq_relaxed(0, its->base + GITS_CWRITER);
+ writel_relaxed(1, its->base + GITS_CTLR);
+
+ if ((tmp ^ baser) & GITS_BASER_SHAREABILITY_MASK) {
+ pr_info("ITS: using cache flushing for cmd queue\n");
+ its->flags |= ITS_FLAGS_CMDQ_NEEDS_FLUSHING;
+ }
+
+ if (of_property_read_bool(its->msi_chip.of_node, "msi-controller")) {
+ its->domain = irq_domain_add_tree(NULL, &its_domain_ops, its);
+ if (!its->domain) {
+ err = -ENOMEM;
+ goto out_free_tables;
+ }
+
+ its->domain->parent = parent;
+
+ its->msi_chip.domain = pci_msi_create_irq_domain(node,
+ &its_pci_msi_domain_info,
+ its->domain);
+ if (!its->msi_chip.domain) {
+ err = -ENOMEM;
+ goto out_free_domains;
+ }
+
+ err = of_pci_msi_chip_add(&its->msi_chip);
+ if (err)
+ goto out_free_domains;
+ }
+
+ spin_lock(&its_lock);
+ list_add(&its->entry, &its_nodes);
+ spin_unlock(&its_lock);
+
+ return 0;
+
+out_free_domains:
+ if (its->msi_chip.domain)
+ irq_domain_remove(its->msi_chip.domain);
+ if (its->domain)
+ irq_domain_remove(its->domain);
+out_free_tables:
+ its_free_tables(its);
+out_free_cmd:
+ kfree(its->cmd_base);
+out_free_its:
+ kfree(its);
+out_unmap:
+ iounmap(its_base);
+ pr_err("ITS: failed probing %s (%d)\n", node->full_name, err);
+ return err;
+}
+
+static bool gic_rdists_supports_plpis(void)
+{
+ return !!(readl_relaxed(gic_data_rdist_rd_base() + GICR_TYPER) & GICR_TYPER_PLPIS);
+}
+
+int its_cpu_init(void)
+{
+ if (!gic_rdists_supports_plpis()) {
+ pr_info("CPU%d: LPIs not supported\n", smp_processor_id());
+ return -ENXIO;
+ }
+
+ if (!list_empty(&its_nodes)) {
+ its_cpu_init_lpis();
+ its_cpu_init_collection();
+ }
+
+ return 0;
+}
+
+static struct of_device_id its_device_id[] = {
+ { .compatible = "arm,gic-v3-its", },
+ {},
+};
+
+int its_init(struct device_node *node, struct rdists *rdists,
+ struct irq_domain *parent_domain)
+{
+ struct device_node *np;
+
+ for (np = of_find_matching_node(node, its_device_id); np;
+ np = of_find_matching_node(np, its_device_id)) {
+ its_probe(np, parent_domain);
+ }
+
+ if (list_empty(&its_nodes)) {
+ pr_warn("ITS: No ITS available, not enabling LPIs\n");
+ return -ENXIO;
+ }
+
+ gic_rdists = rdists;
+ gic_root_node = node;
+
+ its_alloc_lpi_tables();
+ its_lpi_init(rdists->id_bits);
+
+ return 0;
+}
--
2.1.3

2014-11-24 14:39:46

by Marc Zyngier

Subject: [PATCH v3 13/13] irqchip: GICv3: Binding updates for ITS

Add the documentation for the bindings describing the GICv3 ITS.

Signed-off-by: Marc Zyngier <[email protected]>
---
Documentation/devicetree/bindings/arm/gic-v3.txt | 39 ++++++++++++++++++++++++
1 file changed, 39 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/gic-v3.txt b/Documentation/devicetree/bindings/arm/gic-v3.txt
index 33cd05e..ddfade4 100644
--- a/Documentation/devicetree/bindings/arm/gic-v3.txt
+++ b/Documentation/devicetree/bindings/arm/gic-v3.txt
@@ -49,11 +49,29 @@ Optional
occupied by the redistributors. Required if more than one such
region is present.

+Sub-nodes:
+
+GICv3 has one or more Interrupt Translation Services (ITS) that are
+used to route Message Signalled Interrupts (MSI) to the CPUs.
+
+These nodes must have the following properties:
+- compatible : Should at least contain "arm,gic-v3-its".
+- msi-controller : Boolean property. Identifies the node as an MSI controller
+- reg: Specifies the base physical address and size of the ITS
+ registers.
+
+The main GIC node must contain the appropriate #address-cells,
+#size-cells and ranges properties for the reg property of all ITS
+nodes.
+
Examples:

gic: interrupt-controller@2cf00000 {
compatible = "arm,gic-v3";
#interrupt-cells = <3>;
+ #address-cells = <2>;
+ #size-cells = <2>;
+ ranges;
interrupt-controller;
reg = <0x0 0x2f000000 0 0x10000>, // GICD
<0x0 0x2f100000 0 0x200000>, // GICR
@@ -61,11 +79,20 @@ Examples:
<0x0 0x2c010000 0 0x2000>, // GICH
<0x0 0x2c020000 0 0x2000>; // GICV
interrupts = <1 9 4>;
+
+ gic-its@2c200000 {
+ compatible = "arm,gic-v3-its";
+ msi-controller;
+ reg = <0x0 0x2c200000 0 0x200000>;
+ };
};

gic: interrupt-controller@2c010000 {
compatible = "arm,gic-v3";
#interrupt-cells = <3>;
+ #address-cells = <2>;
+ #size-cells = <2>;
+ ranges;
interrupt-controller;
redistributor-stride = <0x0 0x40000>; // 256kB stride
#redistributor-regions = <2>;
@@ -76,4 +103,16 @@ Examples:
<0x0 0x2c060000 0 0x2000>, // GICH
<0x0 0x2c080000 0 0x2000>; // GICV
interrupts = <1 9 4>;
+
+ gic-its@2c200000 {
+ compatible = "arm,gic-v3-its";
+ msi-controller;
+ reg = <0x0 0x2c200000 0 0x200000>;
+ };
+
+ gic-its@2c400000 {
+ compatible = "arm,gic-v3-its";
+ msi-controller;
+ reg = <0x0 0x2c400000 0 0x200000>;
+ };
};
--
2.1.3

2014-11-24 14:40:19

by Marc Zyngier

Subject: [PATCH v3 06/13] irqchip: GICv3: ITS: LPI allocator

LPIs are the type of interrupts that are used by the ITS. Given
the size of the namespace (anywhere between 16 and 32bit), interrupt
IDs are allocated in chunks of 32.
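
The chunk arithmetic is easy to check by hand. The sketch below restates it outside the kernel, assuming the same constants as the patch (LPIs start at ID 8192, 32 IDs per chunk); the toy_* helpers are hypothetical and use fixed-width types instead of the patch's int.

```c
#include <stdint.h>

#define TOY_LPI_BASE     8192u /* first LPI; lower IDs are SGIs/PPIs/SPIs */
#define TOY_CHUNK_SHIFT  5     /* 32 interrupt IDs per chunk */
#define TOY_CHUNK_SIZE   (1u << TOY_CHUNK_SHIFT)

/* Chunk index that contains a given LPI number (lpi >= TOY_LPI_BASE). */
static uint32_t toy_lpi_to_chunk(uint64_t lpi)
{
	return (uint32_t)((lpi - TOY_LPI_BASE) >> TOY_CHUNK_SHIFT);
}

/* First LPI number of a given chunk. */
static uint64_t toy_chunk_to_lpi(uint32_t chunk)
{
	return ((uint64_t)chunk << TOY_CHUNK_SHIFT) + TOY_LPI_BASE;
}

/* Total number of chunks for an ID space of id_bits bits (14..32). */
static uint32_t toy_nr_chunks(unsigned int id_bits)
{
	return toy_lpi_to_chunk(1ULL << id_bits);
}

/* Chunks needed to cover nr_irqs interrupts, rounding up. */
static uint32_t toy_chunks_needed(uint32_t nr_irqs)
{
	return (nr_irqs + TOY_CHUNK_SIZE - 1) / TOY_CHUNK_SIZE;
}
```

With 16 ID bits this gives (65536 - 8192) / 32 = 1792 chunks, and a device asking for 33 vectors ends up with two chunks, i.e. 64 LPIs.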

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 103 +++++++++++++++++++++++++++++++++++++++
1 file changed, 103 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index d24bebd..4154a16 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -586,3 +586,106 @@ static struct irq_chip its_irq_chip = {
.irq_eoi = its_eoi_irq,
.irq_set_affinity = its_set_affinity,
};
+
+/*
+ * How we allocate LPIs:
+ *
+ * The GIC has id_bits bits for interrupt identifiers. From there, we
+ * must subtract 8192 which are reserved for SGIs/PPIs/SPIs. Then, as
+ * we allocate LPIs by chunks of 32, we can shift the whole thing by 5
+ * bits to the right.
+ *
+ * This gives us (((1UL << id_bits) - 8192) >> 5) possible allocations.
+ */
+#define IRQS_PER_CHUNK_SHIFT 5
+#define IRQS_PER_CHUNK (1 << IRQS_PER_CHUNK_SHIFT)
+
+static unsigned long *lpi_bitmap;
+static u32 lpi_chunks;
+static DEFINE_SPINLOCK(lpi_lock);
+
+static int its_lpi_to_chunk(int lpi)
+{
+ return (lpi - 8192) >> IRQS_PER_CHUNK_SHIFT;
+}
+
+static int its_chunk_to_lpi(int chunk)
+{
+ return (chunk << IRQS_PER_CHUNK_SHIFT) + 8192;
+}
+
+static int its_lpi_init(u32 id_bits)
+{
+ lpi_chunks = its_lpi_to_chunk(1UL << id_bits);
+
+ lpi_bitmap = kzalloc(BITS_TO_LONGS(lpi_chunks) * sizeof(long),
+ GFP_KERNEL);
+ if (!lpi_bitmap) {
+ lpi_chunks = 0;
+ return -ENOMEM;
+ }
+
+ pr_info("ITS: Allocated %d chunks for LPIs\n", (int)lpi_chunks);
+ return 0;
+}
+
+static unsigned long *its_lpi_alloc_chunks(int nr_irqs, int *base, int *nr_ids)
+{
+ unsigned long *bitmap = NULL;
+ int chunk_id;
+ int nr_chunks;
+ int i;
+
+ nr_chunks = DIV_ROUND_UP(nr_irqs, IRQS_PER_CHUNK);
+
+ spin_lock(&lpi_lock);
+
+ do {
+ chunk_id = bitmap_find_next_zero_area(lpi_bitmap, lpi_chunks,
+ 0, nr_chunks, 0);
+ if (chunk_id < lpi_chunks)
+ break;
+
+ nr_chunks--;
+ } while (nr_chunks > 0);
+
+ if (!nr_chunks)
+ goto out;
+
+ bitmap = kzalloc(BITS_TO_LONGS(nr_chunks * IRQS_PER_CHUNK) * sizeof (long),
+ GFP_ATOMIC);
+ if (!bitmap)
+ goto out;
+
+ for (i = 0; i < nr_chunks; i++)
+ set_bit(chunk_id + i, lpi_bitmap);
+
+ *base = its_chunk_to_lpi(chunk_id);
+ *nr_ids = nr_chunks * IRQS_PER_CHUNK;
+
+out:
+ spin_unlock(&lpi_lock);
+
+ return bitmap;
+}
+
+static void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)
+{
+ int lpi;
+
+ spin_lock(&lpi_lock);
+
+ for (lpi = base; lpi < (base + nr_ids); lpi += IRQS_PER_CHUNK) {
+ int chunk = its_lpi_to_chunk(lpi);
+ BUG_ON(chunk > lpi_chunks);
+ if (test_bit(chunk, lpi_bitmap)) {
+ clear_bit(chunk, lpi_bitmap);
+ } else {
+ pr_err("Bad LPI chunk %d\n", chunk);
+ }
+ }
+
+ spin_unlock(&lpi_lock);
+
+ kfree(bitmap);
+}
--
2.1.3

2014-11-24 14:40:44

by Marc Zyngier

Subject: [PATCH v3 07/13] irqchip: GICv3: ITS: tables allocators

Interrupt translation is driven by a set of tables (device, ITT,
and collection) that together determine which CPU an interrupt is
eventually delivered to. In addition, the redistributors rely on a
couple of tables (configuration and pending) to deliver the
interrupts to the CPUs.

This patch adds the required allocators for these tables.
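
The sizing of the two redistributor tables follows from one configuration byte per LPI and one pending bit per interrupt ID (including the 8192 non-LPI IDs). A small sketch of that arithmetic, using hypothetical helper names rather than the macros from the patch:

```c
#include <stdint.h>

/* Integer log2, rounding down; v must be non-zero. */
static uint32_t toy_ilog2(uint64_t v)
{
	uint32_t r = 0;

	while (v >>= 1)
		r++;

	return r;
}

/*
 * Pending table size in bytes: one bit per LPI covered by the property
 * table (one byte each there), plus 8192 bits for the SGI/PPI/SPI IDs.
 */
static uint64_t toy_pendbase_size(uint64_t propbase_bytes)
{
	return propbase_bytes / 8 + 8192 / 8;
}

/* Number of ID bits needed, including the IDs below 8192. */
static uint32_t toy_lpi_nrbits(uint64_t propbase_bytes)
{
	return toy_ilog2(propbase_bytes + 8192);
}
```

For the 64kB property table used here, this yields a 9kB pending table (8kB + 1kB, before rounding up for the 64kB alignment requirement) and 16 ID bits.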

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 292 +++++++++++++++++++++++++++++++++++++++
1 file changed, 292 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 4154a16..03f9831 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -91,6 +91,14 @@ struct its_device {
u32 device_id;
};

+static LIST_HEAD(its_nodes);
+static DEFINE_SPINLOCK(its_lock);
+static struct device_node *gic_root_node;
+static struct rdists *gic_rdists;
+
+#define gic_data_rdist() (raw_cpu_ptr(gic_rdists->rdist))
+#define gic_data_rdist_rd_base() (gic_data_rdist()->rd_base)
+
/*
* ITS command descriptors - parameters to be encoded in a command
* block.
@@ -689,3 +697,287 @@ static void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)

kfree(bitmap);
}
+
+/*
+ * We allocate 64kB for PROPBASE. That gives us at most 64K LPIs to
+ * deal with (one configuration byte per interrupt). PENDBASE has to
+ * be 64kB aligned (one bit per LPI, plus 8192 bits for SPI/PPI/SGI).
+ */
+#define LPI_PROPBASE_SZ SZ_64K
+#define LPI_PENDBASE_SZ (LPI_PROPBASE_SZ / 8 + SZ_1K)
+
+/*
+ * This is how many bits of ID we need, including the useless ones.
+ */
+#define LPI_NRBITS ilog2(LPI_PROPBASE_SZ + SZ_8K)
+
+#define LPI_PROP_DEFAULT_PRIO 0xa0
+
+static int __init its_alloc_lpi_tables(void)
+{
+ phys_addr_t paddr;
+
+ gic_rdists->prop_page = alloc_pages(GFP_NOWAIT,
+ get_order(LPI_PROPBASE_SZ));
+ if (!gic_rdists->prop_page) {
+ pr_err("Failed to allocate PROPBASE\n");
+ return -ENOMEM;
+ }
+
+ paddr = page_to_phys(gic_rdists->prop_page);
+ pr_info("GIC: using LPI property table @%pa\n", &paddr);
+
+ /* Priority 0xa0, Group-1, disabled */
+ memset(page_address(gic_rdists->prop_page),
+ LPI_PROP_DEFAULT_PRIO | LPI_PROP_GROUP1,
+ LPI_PROPBASE_SZ);
+
+ /* Make sure the GIC will observe the written configuration */
+ __flush_dcache_area(page_address(gic_rdists->prop_page), LPI_PROPBASE_SZ);
+
+ return 0;
+}
+
+static const char *its_base_type_string[] = {
+ [GITS_BASER_TYPE_DEVICE] = "Devices",
+ [GITS_BASER_TYPE_VCPU] = "Virtual CPUs",
+ [GITS_BASER_TYPE_CPU] = "Physical CPUs",
+ [GITS_BASER_TYPE_COLLECTION] = "Interrupt Collections",
+ [GITS_BASER_TYPE_RESERVED5] = "Reserved (5)",
+ [GITS_BASER_TYPE_RESERVED6] = "Reserved (6)",
+ [GITS_BASER_TYPE_RESERVED7] = "Reserved (7)",
+};
+
+static void its_free_tables(struct its_node *its)
+{
+ int i;
+
+ for (i = 0; i < GITS_BASER_NR_REGS; i++) {
+ if (its->tables[i]) {
+ free_page((unsigned long)its->tables[i]);
+ its->tables[i] = NULL;
+ }
+ }
+}
+
+static int its_alloc_tables(struct its_node *its)
+{
+ int err;
+ int i;
+ int psz = PAGE_SIZE;
+ u64 shr = GITS_BASER_InnerShareable;
+
+ for (i = 0; i < GITS_BASER_NR_REGS; i++) {
+ u64 val = readq_relaxed(its->base + GITS_BASER + i * 8);
+ u64 type = GITS_BASER_TYPE(val);
+ u64 entry_size = GITS_BASER_ENTRY_SIZE(val);
+ u64 tmp;
+ void *base;
+
+ if (type == GITS_BASER_TYPE_NONE)
+ continue;
+
+ /* We're lazy and only allocate a single page for now */
+ base = (void *)get_zeroed_page(GFP_KERNEL);
+ if (!base) {
+ err = -ENOMEM;
+ goto out_free;
+ }
+
+ its->tables[i] = base;
+
+retry_baser:
+ val = (virt_to_phys(base) |
+ (type << GITS_BASER_TYPE_SHIFT) |
+ ((entry_size - 1) << GITS_BASER_ENTRY_SIZE_SHIFT) |
+ GITS_BASER_WaWb |
+ shr |
+ GITS_BASER_VALID);
+
+ switch (psz) {
+ case SZ_4K:
+ val |= GITS_BASER_PAGE_SIZE_4K;
+ break;
+ case SZ_16K:
+ val |= GITS_BASER_PAGE_SIZE_16K;
+ break;
+ case SZ_64K:
+ val |= GITS_BASER_PAGE_SIZE_64K;
+ break;
+ }
+
+ val |= (PAGE_SIZE / psz) - 1;
+
+ writeq_relaxed(val, its->base + GITS_BASER + i * 8);
+ tmp = readq_relaxed(its->base + GITS_BASER + i * 8);
+
+ if ((val ^ tmp) & GITS_BASER_SHAREABILITY_MASK) {
+ /*
+ * Shareability didn't stick. Just use
+ * whatever the read reported, which is likely
+ * to be the only thing this redistributor
+ * supports.
+ */
+ shr = tmp & GITS_BASER_SHAREABILITY_MASK;
+ goto retry_baser;
+ }
+
+ if ((val ^ tmp) & GITS_BASER_PAGE_SIZE_MASK) {
+ /*
+ * Page size didn't stick. Let's try a smaller
+ * size and retry. If we reach 4K, then
+ * something is horribly wrong...
+ */
+ switch (psz) {
+ case SZ_16K:
+ psz = SZ_4K;
+ goto retry_baser;
+ case SZ_64K:
+ psz = SZ_16K;
+ goto retry_baser;
+ }
+ }
+
+ if (val != tmp) {
+ pr_err("ITS: %s: GITS_BASER%d doesn't stick: %lx %lx\n",
+ its->msi_chip.of_node->full_name, i,
+ (unsigned long) val, (unsigned long) tmp);
+ err = -ENXIO;
+ goto out_free;
+ }
+
+ pr_info("ITS: allocated %d %s @%lx (psz %dK, shr %d)\n",
+ (int)(PAGE_SIZE / entry_size),
+ its_base_type_string[type],
+ (unsigned long)virt_to_phys(base),
+ psz / SZ_1K, (int)shr >> GITS_BASER_SHAREABILITY_SHIFT);
+ }
+
+ return 0;
+
+out_free:
+ its_free_tables(its);
+
+ return err;
+}
+
+static int its_alloc_collections(struct its_node *its)
+{
+ its->collections = kzalloc(nr_cpu_ids * sizeof(*its->collections),
+ GFP_KERNEL);
+ if (!its->collections)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void its_cpu_init_lpis(void)
+{
+ void __iomem *rbase = gic_data_rdist_rd_base();
+ struct page *pend_page;
+ u64 val, tmp;
+
+ /* If we didn't allocate the pending table yet, do it now */
+ pend_page = gic_data_rdist()->pend_page;
+ if (!pend_page) {
+ phys_addr_t paddr;
+ /*
+ * The pending pages have to be at least 64kB aligned,
+ * hence the 'max(LPI_PENDBASE_SZ, SZ_64K)' below.
+ */
+ pend_page = alloc_pages(GFP_NOWAIT | __GFP_ZERO,
+ get_order(max(LPI_PENDBASE_SZ, SZ_64K)));
+ if (!pend_page) {
+ pr_err("Failed to allocate PENDBASE for CPU%d\n",
+ smp_processor_id());
+ return;
+ }
+
+ /* Make sure the GIC will observe the zero-ed page */
+ __flush_dcache_area(page_address(pend_page), LPI_PENDBASE_SZ);
+
+ paddr = page_to_phys(pend_page);
+ pr_info("CPU%d: using LPI pending table @%pa\n",
+ smp_processor_id(), &paddr);
+ gic_data_rdist()->pend_page = pend_page;
+ }
+
+ /* Disable LPIs */
+ val = readl_relaxed(rbase + GICR_CTLR);
+ val &= ~GICR_CTLR_ENABLE_LPIS;
+ writel_relaxed(val, rbase + GICR_CTLR);
+
+ /*
+ * Make sure any change to the table is observable by the GIC.
+ */
+ dsb(sy);
+
+ /* set PROPBASE */
+ val = (page_to_phys(gic_rdists->prop_page) |
+ GICR_PROPBASER_InnerShareable |
+ GICR_PROPBASER_WaWb |
+ ((LPI_NRBITS - 1) & GICR_PROPBASER_IDBITS_MASK));
+
+ writeq_relaxed(val, rbase + GICR_PROPBASER);
+ tmp = readq_relaxed(rbase + GICR_PROPBASER);
+
+ if ((tmp ^ val) & GICR_PROPBASER_SHAREABILITY_MASK) {
+ pr_info_once("GIC: using cache flushing for LPI property table\n");
+ gic_rdists->flags |= RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING;
+ }
+
+ /* set PENDBASE */
+ val = (page_to_phys(pend_page) |
+ GICR_PROPBASER_InnerShareable |
+ GICR_PROPBASER_WaWb);
+
+ writeq_relaxed(val, rbase + GICR_PENDBASER);
+
+ /* Enable LPIs */
+ val = readl_relaxed(rbase + GICR_CTLR);
+ val |= GICR_CTLR_ENABLE_LPIS;
+ writel_relaxed(val, rbase + GICR_CTLR);
+
+ /* Make sure the GIC has seen the above */
+ dsb(sy);
+}
+
+static void its_cpu_init_collection(void)
+{
+ struct its_node *its;
+ int cpu;
+
+ spin_lock(&its_lock);
+ cpu = smp_processor_id();
+
+ list_for_each_entry(its, &its_nodes, entry) {
+ u64 target;
+
+ /*
+ * We now have to bind each collection to its target
+ * redistributor.
+ */
+ if (readq_relaxed(its->base + GITS_TYPER) & GITS_TYPER_PTA) {
+ /*
+ * This ITS wants the physical address of the
+ * redistributor.
+ */
+ target = gic_data_rdist()->phys_base;
+ } else {
+ /*
+ * This ITS wants a linear CPU number.
+ */
+ target = readq_relaxed(gic_data_rdist_rd_base() + GICR_TYPER);
+ target = GICR_TYPER_CPU_NUMBER(target);
+ }
+
+ /* Perform collection mapping */
+ its->collections[cpu].target_address = target;
+ its->collections[cpu].col_id = cpu;
+
+ its_send_mapc(its, &its->collections[cpu], 1);
+ its_send_invall(its, &its->collections[cpu]);
+ }
+
+ spin_unlock(&its_lock);
+}
--
2.1.3

2014-11-24 14:35:49

by Marc Zyngier

Subject: [PATCH v3 05/13] irqchip: GICv3: ITS: irqchip implementation

Add the usual methods that are used to present an irqchip to the
rest of the kernel.

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 77 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index a5ab12c..d24bebd 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -40,6 +40,8 @@

#define ITS_FLAGS_CMDQ_NEEDS_FLUSHING (1 << 0)

+#define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING (1 << 0)
+
/*
* Collection structure - just an ID, and a redistributor address to
* ping. We use one per CPU as a bag of interrupts assigned to this
@@ -509,3 +511,78 @@ static void its_send_invall(struct its_node *its, struct its_collection *col)

its_send_single_command(its, its_build_invall_cmd, &desc);
}
+
+/*
+ * irqchip functions - assumes MSI, mostly.
+ */
+
+static inline u32 its_get_event_id(struct irq_data *d)
+{
+ struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+ return d->hwirq - its_dev->lpi_base;
+}
+
+static void lpi_set_config(struct irq_data *d, bool enable)
+{
+ struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+ irq_hw_number_t hwirq = d->hwirq;
+ u32 id = its_get_event_id(d);
+ u8 *cfg = page_address(gic_rdists->prop_page) + hwirq - 8192;
+
+ if (enable)
+ *cfg |= LPI_PROP_ENABLED;
+ else
+ *cfg &= ~LPI_PROP_ENABLED;
+
+ /*
+ * Make the above write visible to the redistributors.
+ * And yes, we're flushing exactly: One. Single. Byte.
+ * Humpf...
+ */
+ if (gic_rdists->flags & RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING)
+ __flush_dcache_area(cfg, sizeof(*cfg));
+ else
+ dsb(ishst);
+ its_send_inv(its_dev, id);
+}
+
+static void its_mask_irq(struct irq_data *d)
+{
+ lpi_set_config(d, false);
+}
+
+static void its_unmask_irq(struct irq_data *d)
+{
+ lpi_set_config(d, true);
+}
+
+static void its_eoi_irq(struct irq_data *d)
+{
+ gic_write_eoir(d->hwirq);
+}
+
+static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
+ bool force)
+{
+ unsigned int cpu = cpumask_any_and(mask_val, cpu_online_mask);
+ struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+ struct its_collection *target_col;
+ u32 id = its_get_event_id(d);
+
+ if (cpu >= nr_cpu_ids)
+ return -EINVAL;
+
+ target_col = &its_dev->its->collections[cpu];
+ its_send_movi(its_dev, target_col, id);
+ its_dev->collection = target_col;
+
+ return IRQ_SET_MASK_OK_DONE;
+}
+
+static struct irq_chip its_irq_chip = {
+ .name = "ITS",
+ .irq_mask = its_mask_irq,
+ .irq_unmask = its_unmask_irq,
+ .irq_eoi = its_eoi_irq,
+ .irq_set_affinity = its_set_affinity,
+};
--
2.1.3
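
As a rough illustration of what lpi_set_config() does to the property table, here is a user-space model (not part of the patch). It treats the table as a plain byte array with one configuration byte per LPI, indexed from INTID 8192. LPI_ID_BASE and the two helper names are assumptions made for the sketch; the LPI_PROP_* values come from the series. Cache maintenance and the INV command are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define LPI_PROP_GROUP1   (1 << 1)
#define LPI_PROP_ENABLED  (1 << 0)
#define LPI_ID_BASE       8192   /* hypothetical name for the literal in the patch */

/* Locate the one-byte configuration entry for a given LPI. */
static uint8_t *lpi_cfg_byte(uint8_t *prop_table, uint32_t hwirq)
{
    return prop_table + (hwirq - LPI_ID_BASE);
}

/* Set or clear the enable bit, leaving the other bits of the entry alone. */
static void lpi_model_set_enable(uint8_t *prop_table, uint32_t hwirq, bool enable)
{
    uint8_t *cfg = lpi_cfg_byte(prop_table, hwirq);

    if (enable)
        *cfg |= LPI_PROP_ENABLED;
    else
        *cfg &= (uint8_t)~LPI_PROP_ENABLED;
}
```

In the driver, the write is followed by either a cache flush of that single byte or a dsb(ishst), and then by an INV command so the redistributor reloads the entry.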

2014-11-24 14:41:19

by Marc Zyngier

Subject: [PATCH v3 12/13] irqchip: GICv3: ITS: enable compilation of the ITS driver

Get the show on the road...

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/Kconfig | 1 +
drivers/irqchip/Kconfig | 4 ++++
drivers/irqchip/Makefile | 1 +
3 files changed, 6 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index ac9afde..1f49c288 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -13,6 +13,7 @@ config ARM64
select ARM_GIC
select AUDIT_ARCH_COMPAT_GENERIC
select ARM_GIC_V3
+ select ARM_GIC_V3_ITS if PCI_MSI
select BUILDTIME_EXTABLE_SORT
select CLONE_BACKWARDS
select COMMON_CLK
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 4631685..aaa260b 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -16,6 +16,10 @@ config ARM_GIC_V3
select MULTI_IRQ_HANDLER
select IRQ_DOMAIN_HIERARCHY

+config ARM_GIC_V3_ITS
+ bool
+ select PCI_MSI_IRQ_DOMAIN
+
config ARM_NVIC
bool
select IRQ_DOMAIN
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 173bb5f..ec3621d 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_ARCH_SUNXI) += irq-sunxi-nmi.o
obj-$(CONFIG_ARCH_SPEAR3XX) += spear-shirq.o
obj-$(CONFIG_ARM_GIC) += irq-gic.o irq-gic-common.o
obj-$(CONFIG_ARM_GIC_V3) += irq-gic-v3.o irq-gic-common.o
+obj-$(CONFIG_ARM_GIC_V3_ITS) += irq-gic-v3-its.o
obj-$(CONFIG_ARM_NVIC) += irq-nvic.o
obj-$(CONFIG_ARM_VIC) += irq-vic.o
obj-$(CONFIG_ATMEL_AIC_IRQ) += irq-atmel-aic-common.o irq-atmel-aic.o
--
2.1.3

2014-11-24 14:41:40

by Marc Zyngier

Subject: [PATCH v3 08/13] irqchip: GICv3: ITS: device allocation and configuration

The ITS has a notion of "device" that can write to it in order to
generate an interrupt.

Conversely, the driver maintains a per-ITS list of devices, together
with their configuration information, and uses this to configure
the HW.

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 74 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 03f9831..d687fd4 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -981,3 +981,77 @@ static void its_cpu_init_collection(void)

spin_unlock(&its_lock);
}
+
+static struct its_device *its_find_device(struct its_node *its, u32 dev_id)
+{
+ struct its_device *its_dev = NULL, *tmp;
+
+ raw_spin_lock(&its->lock);
+
+ list_for_each_entry(tmp, &its->its_device_list, entry) {
+ if (tmp->device_id == dev_id) {
+ its_dev = tmp;
+ break;
+ }
+ }
+
+ raw_spin_unlock(&its->lock);
+
+ return its_dev;
+}
+
+static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
+ int nvecs)
+{
+ struct its_device *dev;
+ unsigned long *lpi_map;
+ void *itt;
+ int lpi_base;
+ int nr_lpis;
+ int cpu;
+ int sz;
+
+ dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+ sz = nvecs * its->ite_size;
+ sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
+ itt = kmalloc(sz, GFP_KERNEL);
+ lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
+
+ if (!dev || !itt || !lpi_map) {
+ kfree(dev);
+ kfree(itt);
+ kfree(lpi_map);
+ return NULL;
+ }
+
+ dev->its = its;
+ dev->itt = itt;
+ dev->nr_ites = nvecs;
+ dev->lpi_map = lpi_map;
+ dev->lpi_base = lpi_base;
+ dev->nr_lpis = nr_lpis;
+ dev->device_id = dev_id;
+ INIT_LIST_HEAD(&dev->entry);
+
+ raw_spin_lock(&its->lock);
+ list_add(&dev->entry, &its->its_device_list);
+ raw_spin_unlock(&its->lock);
+
+ /* Bind the device to the first online CPU */
+ cpu = cpumask_first(cpu_online_mask);
+ dev->collection = &its->collections[cpu];
+
+ /* Map device to its ITT */
+ its_send_mapd(dev, 1);
+
+ return dev;
+}
+
+static void its_free_device(struct its_device *its_dev)
+{
+ raw_spin_lock(&its_dev->its->lock);
+ list_del(&its_dev->entry);
+ raw_spin_unlock(&its_dev->its->lock);
+ kfree(its_dev->itt);
+ kfree(its_dev);
+}
--
2.1.3
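
The ITT sizing in its_create_device() over-allocates so that a 256-byte aligned table always fits inside the kmalloc'ed buffer, and its_build_mapd_cmd() later rounds the physical address up to that alignment. A small stand-alone sketch of the arithmetic (not part of the patch; itt_alloc_size() and itt_align() are hypothetical names):

```c
#include <stddef.h>
#include <stdint.h>

#define ITS_ITT_ALIGN 256u

/*
 * Bytes to allocate for nvecs entries of ite_size bytes each:
 * at least one alignment unit of payload, plus enough slack to
 * round the start address up to the next 256-byte boundary.
 */
static size_t itt_alloc_size(unsigned int nvecs, unsigned int ite_size)
{
    size_t sz = (size_t)nvecs * ite_size;

    if (sz < ITS_ITT_ALIGN)
        sz = ITS_ITT_ALIGN;

    return sz + ITS_ITT_ALIGN - 1;
}

/* Round an address up to the ITT alignment, like ALIGN() in the kernel. */
static uint64_t itt_align(uint64_t addr)
{
    return (addr + ITS_ITT_ALIGN - 1) & ~(uint64_t)(ITS_ITT_ALIGN - 1);
}
```

In the worst case the buffer starts one byte past a boundary, so 255 bytes are skipped and the payload ends exactly at the end of the allocation.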

2014-11-24 14:42:39

by Marc Zyngier

Subject: [PATCH v3 01/13] arm64: PCI/MSI: Use asm-generic/msi.h

In order to support CONFIG_GENERIC_MSI_IRQ_DOMAIN, we need to
define msi_alloc_info_t. As the generic version exposed in
asm-generic/msi.h is perfectly convenient, import this file
as asm/msi.h.

Acked-by: Will Deacon <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/include/asm/Kbuild | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild
index dc770bd..e315bd8 100644
--- a/arch/arm64/include/asm/Kbuild
+++ b/arch/arm64/include/asm/Kbuild
@@ -28,6 +28,7 @@ generic-y += local64.h
generic-y += mcs_spinlock.h
generic-y += mman.h
generic-y += msgbuf.h
+generic-y += msi.h
generic-y += mutex.h
generic-y += pci.h
generic-y += pci-bridge.h
--
2.1.3

2014-11-24 14:35:39

by Marc Zyngier

Subject: [PATCH v3 03/13] irqchip: GICv3: rework redistributor structure

The basic GICv3 driver has almost no use for the redistributors
(other than the basic per-CPU interrupts), but the ITS needs
a lot more from them.

As such, rework the set of data structures. The behaviour of the
GICv3 driver is otherwise unaffected.

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 73 +++++++++++++++++++++++---------------
include/linux/irqchip/arm-gic-v3.h | 15 ++++++++
2 files changed, 59 insertions(+), 29 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 4cb355a..43e57da 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -34,20 +34,25 @@
#include "irq-gic-common.h"
#include "irqchip.h"

+struct redist_region {
+ void __iomem *redist_base;
+ phys_addr_t phys_base;
+};
+
struct gic_chip_data {
void __iomem *dist_base;
- void __iomem **redist_base;
- void __iomem * __percpu *rdist;
+ struct redist_region *redist_regions;
+ struct rdists rdists;
struct irq_domain *domain;
u64 redist_stride;
- u32 redist_regions;
+ u32 nr_redist_regions;
unsigned int irq_nr;
};

static struct gic_chip_data gic_data __read_mostly;

-#define gic_data_rdist() (this_cpu_ptr(gic_data.rdist))
-#define gic_data_rdist_rd_base() (*gic_data_rdist())
+#define gic_data_rdist() (this_cpu_ptr(gic_data.rdists.rdist))
+#define gic_data_rdist_rd_base() (gic_data_rdist()->rd_base)
#define gic_data_rdist_sgi_base() (gic_data_rdist_rd_base() + SZ_64K)

/* Our default, arbitrary priority value. Linux only uses one anyway. */
@@ -333,8 +338,8 @@ static int gic_populate_rdist(void)
MPIDR_AFFINITY_LEVEL(mpidr, 1) << 8 |
MPIDR_AFFINITY_LEVEL(mpidr, 0));

- for (i = 0; i < gic_data.redist_regions; i++) {
- void __iomem *ptr = gic_data.redist_base[i];
+ for (i = 0; i < gic_data.nr_redist_regions; i++) {
+ void __iomem *ptr = gic_data.redist_regions[i].redist_base;
u32 reg;

reg = readl_relaxed(ptr + GICR_PIDR2) & GIC_PIDR2_ARCH_MASK;
@@ -347,10 +352,13 @@ static int gic_populate_rdist(void)
do {
typer = readq_relaxed(ptr + GICR_TYPER);
if ((typer >> 32) == aff) {
+ u64 offset = ptr - gic_data.redist_regions[i].redist_base;
gic_data_rdist_rd_base() = ptr;
- pr_info("CPU%d: found redistributor %llx @%p\n",
+ gic_data_rdist()->phys_base = gic_data.redist_regions[i].phys_base + offset;
+ pr_info("CPU%d: found redistributor %llx region %d:%pa\n",
smp_processor_id(),
- (unsigned long long)mpidr, ptr);
+ (unsigned long long)mpidr,
+ i, &gic_data_rdist()->phys_base);
return 0;
}

@@ -673,9 +681,10 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
static int __init gic_of_init(struct device_node *node, struct device_node *parent)
{
void __iomem *dist_base;
- void __iomem **redist_base;
+ struct redist_region *rdist_regs;
u64 redist_stride;
- u32 redist_regions;
+ u32 nr_redist_regions;
+ u32 typer;
u32 reg;
int gic_irqs;
int err;
@@ -696,48 +705,54 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
goto out_unmap_dist;
}

- if (of_property_read_u32(node, "#redistributor-regions", &redist_regions))
- redist_regions = 1;
+ if (of_property_read_u32(node, "#redistributor-regions", &nr_redist_regions))
+ nr_redist_regions = 1;

- redist_base = kzalloc(sizeof(*redist_base) * redist_regions, GFP_KERNEL);
- if (!redist_base) {
+ rdist_regs = kzalloc(sizeof(*rdist_regs) * nr_redist_regions, GFP_KERNEL);
+ if (!rdist_regs) {
err = -ENOMEM;
goto out_unmap_dist;
}

- for (i = 0; i < redist_regions; i++) {
- redist_base[i] = of_iomap(node, 1 + i);
- if (!redist_base[i]) {
+ for (i = 0; i < nr_redist_regions; i++) {
+ struct resource res;
+ int ret;
+
+ ret = of_address_to_resource(node, 1 + i, &res);
+ rdist_regs[i].redist_base = of_iomap(node, 1 + i);
+ if (ret || !rdist_regs[i].redist_base) {
pr_err("%s: couldn't map region %d\n",
node->full_name, i);
err = -ENODEV;
goto out_unmap_rdist;
}
+ rdist_regs[i].phys_base = res.start;
}

if (of_property_read_u64(node, "redistributor-stride", &redist_stride))
redist_stride = 0;

gic_data.dist_base = dist_base;
- gic_data.redist_base = redist_base;
- gic_data.redist_regions = redist_regions;
+ gic_data.redist_regions = rdist_regs;
+ gic_data.nr_redist_regions = nr_redist_regions;
gic_data.redist_stride = redist_stride;

/*
* Find out how many interrupts are supported.
* The GIC only supports up to 1020 interrupt sources (SGI+PPI+SPI)
*/
- gic_irqs = readl_relaxed(gic_data.dist_base + GICD_TYPER) & 0x1f;
- gic_irqs = (gic_irqs + 1) * 32;
+ typer = readl_relaxed(gic_data.dist_base + GICD_TYPER);
+ gic_data.rdists.id_bits = GICD_TYPER_ID_BITS(typer);
+ gic_irqs = GICD_TYPER_IRQS(typer);
if (gic_irqs > 1020)
gic_irqs = 1020;
gic_data.irq_nr = gic_irqs;

gic_data.domain = irq_domain_add_tree(node, &gic_irq_domain_ops,
&gic_data);
- gic_data.rdist = alloc_percpu(typeof(*gic_data.rdist));
+ gic_data.rdists.rdist = alloc_percpu(typeof(*gic_data.rdists.rdist));

- if (WARN_ON(!gic_data.domain) || WARN_ON(!gic_data.rdist)) {
+ if (WARN_ON(!gic_data.domain) || WARN_ON(!gic_data.rdists.rdist)) {
err = -ENOMEM;
goto out_free;
}
@@ -754,12 +769,12 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
out_free:
if (gic_data.domain)
irq_domain_remove(gic_data.domain);
- free_percpu(gic_data.rdist);
+ free_percpu(gic_data.rdists.rdist);
out_unmap_rdist:
- for (i = 0; i < redist_regions; i++)
- if (redist_base[i])
- iounmap(redist_base[i]);
- kfree(redist_base);
+ for (i = 0; i < nr_redist_regions; i++)
+ if (rdist_regs[i].redist_base)
+ iounmap(rdist_regs[i].redist_base);
+ kfree(rdist_regs);
out_unmap_dist:
iounmap(dist_base);
return err;
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 03a4ea3..040615a 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -49,6 +49,10 @@
#define GICD_CTLR_ENABLE_G1A (1U << 1)
#define GICD_CTLR_ENABLE_G1 (1U << 0)

+#define GICD_TYPER_ID_BITS(typer) ((((typer) >> 19) & 0x1f) + 1)
+#define GICD_TYPER_IRQS(typer) ((((typer) & 0x1f) + 1) * 32)
+#define GICD_TYPER_LPIS (1U << 17)
+
#define GICD_IROUTER_SPI_MODE_ONE (0U << 31)
#define GICD_IROUTER_SPI_MODE_ANY (1U << 31)

@@ -189,6 +193,17 @@

#include <linux/stringify.h>

+struct rdists {
+ struct {
+ void __iomem *rd_base;
+ struct page *pend_page;
+ phys_addr_t phys_base;
+ } __percpu *rdist;
+ struct page *prop_page;
+ int id_bits;
+ u64 flags;
+};
+
static inline void gic_write_eoir(u64 irq)
{
asm volatile("msr_s " __stringify(ICC_EOIR1_EL1) ", %0" : : "r" (irq));
--
2.1.3
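
To make the two new GICD_TYPER accessors concrete, here is a stand-alone sketch (not part of the patch) that decodes a register value the same way gic_of_init() does, including the clamp to 1020 interrupts. The macros are copied from the header change; gic_nr_irqs() and gic_id_bits() are hypothetical wrappers added for the example.

```c
#include <stdint.h>

#define GICD_TYPER_ID_BITS(typer)  ((((typer) >> 19) & 0x1f) + 1)
#define GICD_TYPER_IRQS(typer)     ((((typer) & 0x1f) + 1) * 32)

/* Number of SGI+PPI+SPI sources, capped at the architectural 1020. */
static unsigned int gic_nr_irqs(uint32_t typer)
{
    unsigned int n = GICD_TYPER_IRQS(typer);

    return n > 1020 ? 1020 : n;
}

/* Width of the interrupt ID space, in bits. */
static unsigned int gic_id_bits(uint32_t typer)
{
    return GICD_TYPER_ID_BITS(typer);
}
```

For instance, a value with the low five bits all set and 15 in bits [23:19] decodes to 1020 interrupts (1024 before the clamp) and a 16-bit ID space.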

2014-11-24 14:35:36

by Marc Zyngier

Subject: [PATCH v3 02/13] irqchip: GICv3: Convert to domain hierarchy

In order to start supporting stacked domains, convert the GICv3
code base to the new domain hierarchy framework, which mostly
amounts to supporting the new alloc/free callbacks.

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/Kconfig | 1 +
drivers/irqchip/irq-gic-v3.c | 42 +++++++++++++++++++++++++++++++++++++-----
2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index b21f12f..4631685 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -14,6 +14,7 @@ config ARM_GIC_V3
bool
select IRQ_DOMAIN
select MULTI_IRQ_HANDLER
+ select IRQ_DOMAIN_HIERARCHY

config ARM_NVIC
bool
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index aa17ae8..4cb355a 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -594,14 +594,14 @@ static int gic_irq_domain_map(struct irq_domain *d, unsigned int irq,
/* PPIs */
if (hw < 32) {
irq_set_percpu_devid(irq);
- irq_set_chip_and_handler(irq, &gic_chip,
- handle_percpu_devid_irq);
+ irq_domain_set_info(d, irq, hw, &gic_chip, d->host_data,
+ handle_percpu_devid_irq, NULL, NULL);
set_irq_flags(irq, IRQF_VALID | IRQF_NOAUTOEN);
}
/* SPIs */
if (hw >= 32 && hw < gic_data.irq_nr) {
- irq_set_chip_and_handler(irq, &gic_chip,
- handle_fasteoi_irq);
+ irq_domain_set_info(d, irq, hw, &gic_chip, d->host_data,
+ handle_fasteoi_irq, NULL, NULL);
set_irq_flags(irq, IRQF_VALID | IRQF_PROBE);
}
irq_set_chip_data(irq, d->host_data);
@@ -633,9 +633,41 @@ static int gic_irq_domain_xlate(struct irq_domain *d,
return 0;
}

+static int gic_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+ unsigned int nr_irqs, void *arg)
+{
+ int i, ret;
+ irq_hw_number_t hwirq;
+ unsigned int type = IRQ_TYPE_NONE;
+ struct of_phandle_args *irq_data = arg;
+
+ ret = gic_irq_domain_xlate(domain, irq_data->np, irq_data->args,
+ irq_data->args_count, &hwirq, &type);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++)
+ gic_irq_domain_map(domain, virq + i, hwirq + i);
+
+ return 0;
+}
+
+static void gic_irq_domain_free(struct irq_domain *domain, unsigned int virq,
+ unsigned int nr_irqs)
+{
+ int i;
+
+ for (i = 0; i < nr_irqs; i++) {
+ struct irq_data *d = irq_domain_get_irq_data(domain, virq + i);
+ irq_set_handler(virq + i, NULL);
+ irq_domain_reset_irq_data(d);
+ }
+}
+
static const struct irq_domain_ops gic_irq_domain_ops = {
- .map = gic_irq_domain_map,
.xlate = gic_irq_domain_xlate,
+ .alloc = gic_irq_domain_alloc,
+ .free = gic_irq_domain_free,
};

static int __init gic_of_init(struct device_node *node, struct device_node *parent)
--
2.1.3

2014-11-24 14:43:51

by Marc Zyngier

Subject: [PATCH v3 04/13] irqchip: GICv3: ITS command queue

The ITS is configured through a number of commands that the driver
issues to the HW using a memory-based circular buffer.

This patch implements the subset of commands that are required
for Linux.

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 511 +++++++++++++++++++++++++++++++++++++
include/linux/irqchip/arm-gic-v3.h | 102 ++++++++
2 files changed, 613 insertions(+)
create mode 100644 drivers/irqchip/irq-gic-v3-its.c

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
new file mode 100644
index 0000000..a5ab12c
--- /dev/null
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -0,0 +1,511 @@
+/*
+ * Copyright (C) 2013, 2014 ARM Limited, All Rights Reserved.
+ * Author: Marc Zyngier <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/bitmap.h>
+#include <linux/cpu.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/log2.h>
+#include <linux/mm.h>
+#include <linux/msi.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/of_pci.h>
+#include <linux/of_platform.h>
+#include <linux/percpu.h>
+#include <linux/slab.h>
+
+#include <linux/irqchip/arm-gic-v3.h>
+
+#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/exception.h>
+
+#include "irqchip.h"
+
+#define ITS_FLAGS_CMDQ_NEEDS_FLUSHING (1 << 0)
+
+/*
+ * Collection structure - just an ID, and a redistributor address to
+ * ping. We use one per CPU as a bag of interrupts assigned to this
+ * CPU.
+ */
+struct its_collection {
+ u64 target_address;
+ u16 col_id;
+};
+
+/*
+ * The ITS structure - contains most of the infrastructure, with the
+ * msi_controller, the command queue, the collections, and the list of
+ * devices writing to it.
+ */
+struct its_node {
+ raw_spinlock_t lock;
+ struct list_head entry;
+ struct msi_controller msi_chip;
+ struct irq_domain *domain;
+ void __iomem *base;
+ unsigned long phys_base;
+ struct its_cmd_block *cmd_base;
+ struct its_cmd_block *cmd_write;
+ void *tables[GITS_BASER_NR_REGS];
+ struct its_collection *collections;
+ struct list_head its_device_list;
+ u64 flags;
+ u32 ite_size;
+};
+
+#define ITS_ITT_ALIGN SZ_256
+
+/*
+ * The ITS view of a device - belongs to an ITS, a collection, owns an
+ * interrupt translation table, and a list of interrupts.
+ */
+struct its_device {
+ struct list_head entry;
+ struct its_node *its;
+ struct its_collection *collection;
+ void *itt;
+ unsigned long *lpi_map;
+ irq_hw_number_t lpi_base;
+ int nr_lpis;
+ u32 nr_ites;
+ u32 device_id;
+};
+
+/*
+ * ITS command descriptors - parameters to be encoded in a command
+ * block.
+ */
+struct its_cmd_desc {
+ union {
+ struct {
+ struct its_device *dev;
+ u32 event_id;
+ } its_inv_cmd;
+
+ struct {
+ struct its_device *dev;
+ u32 event_id;
+ } its_int_cmd;
+
+ struct {
+ struct its_device *dev;
+ int valid;
+ } its_mapd_cmd;
+
+ struct {
+ struct its_collection *col;
+ int valid;
+ } its_mapc_cmd;
+
+ struct {
+ struct its_device *dev;
+ u32 phys_id;
+ u32 event_id;
+ } its_mapvi_cmd;
+
+ struct {
+ struct its_device *dev;
+ struct its_collection *col;
+ u32 id;
+ } its_movi_cmd;
+
+ struct {
+ struct its_device *dev;
+ u32 event_id;
+ } its_discard_cmd;
+
+ struct {
+ struct its_collection *col;
+ } its_invall_cmd;
+ };
+};
+
+/*
+ * The ITS command block, which is what the ITS actually parses.
+ */
+struct its_cmd_block {
+ u64 raw_cmd[4];
+};
+
+#define ITS_CMD_QUEUE_SZ SZ_64K
+#define ITS_CMD_QUEUE_NR_ENTRIES (ITS_CMD_QUEUE_SZ / sizeof(struct its_cmd_block))
+
+typedef struct its_collection *(*its_cmd_builder_t)(struct its_cmd_block *,
+ struct its_cmd_desc *);
+
+static void its_encode_cmd(struct its_cmd_block *cmd, u8 cmd_nr)
+{
+ cmd->raw_cmd[0] &= ~0xffUL;
+ cmd->raw_cmd[0] |= cmd_nr;
+}
+
+static void its_encode_devid(struct its_cmd_block *cmd, u32 devid)
+{
+ cmd->raw_cmd[0] &= ~(0xffffffffUL << 32);
+ cmd->raw_cmd[0] |= ((u64)devid) << 32;
+}
+
+static void its_encode_event_id(struct its_cmd_block *cmd, u32 id)
+{
+ cmd->raw_cmd[1] &= ~0xffffffffUL;
+ cmd->raw_cmd[1] |= id;
+}
+
+static void its_encode_phys_id(struct its_cmd_block *cmd, u32 phys_id)
+{
+ cmd->raw_cmd[1] &= 0xffffffffUL;
+ cmd->raw_cmd[1] |= ((u64)phys_id) << 32;
+}
+
+static void its_encode_size(struct its_cmd_block *cmd, u8 size)
+{
+ cmd->raw_cmd[1] &= ~0x1fUL;
+ cmd->raw_cmd[1] |= size & 0x1f;
+}
+
+static void its_encode_itt(struct its_cmd_block *cmd, u64 itt_addr)
+{
+ cmd->raw_cmd[2] &= ~0xffffffffffffUL;
+ cmd->raw_cmd[2] |= itt_addr & 0xffffffffff00UL;
+}
+
+static void its_encode_valid(struct its_cmd_block *cmd, int valid)
+{
+ cmd->raw_cmd[2] &= ~(1UL << 63);
+ cmd->raw_cmd[2] |= ((u64)!!valid) << 63;
+}
+
+static void its_encode_target(struct its_cmd_block *cmd, u64 target_addr)
+{
+ cmd->raw_cmd[2] &= ~(0xffffffffUL << 16);
+ cmd->raw_cmd[2] |= (target_addr & (0xffffffffUL << 16));
+}
+
+static void its_encode_collection(struct its_cmd_block *cmd, u16 col)
+{
+ cmd->raw_cmd[2] &= ~0xffffUL;
+ cmd->raw_cmd[2] |= col;
+}
+
+static inline void its_fixup_cmd(struct its_cmd_block *cmd)
+{
+ /* Let's fixup BE commands */
+ cmd->raw_cmd[0] = cpu_to_le64(cmd->raw_cmd[0]);
+ cmd->raw_cmd[1] = cpu_to_le64(cmd->raw_cmd[1]);
+ cmd->raw_cmd[2] = cpu_to_le64(cmd->raw_cmd[2]);
+ cmd->raw_cmd[3] = cpu_to_le64(cmd->raw_cmd[3]);
+}
+
+static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
+ struct its_cmd_desc *desc)
+{
+ unsigned long itt_addr;
+ u8 size = order_base_2(desc->its_mapd_cmd.dev->nr_ites);
+
+ itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
+ itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
+
+ its_encode_cmd(cmd, GITS_CMD_MAPD);
+ its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
+ its_encode_size(cmd, size - 1);
+ its_encode_itt(cmd, itt_addr);
+ its_encode_valid(cmd, desc->its_mapd_cmd.valid);
+
+ its_fixup_cmd(cmd);
+
+ return desc->its_mapd_cmd.dev->collection;
+}
+
+static struct its_collection *its_build_mapc_cmd(struct its_cmd_block *cmd,
+ struct its_cmd_desc *desc)
+{
+ its_encode_cmd(cmd, GITS_CMD_MAPC);
+ its_encode_collection(cmd, desc->its_mapc_cmd.col->col_id);
+ its_encode_target(cmd, desc->its_mapc_cmd.col->target_address);
+ its_encode_valid(cmd, desc->its_mapc_cmd.valid);
+
+ its_fixup_cmd(cmd);
+
+ return desc->its_mapc_cmd.col;
+}
+
+static struct its_collection *its_build_mapvi_cmd(struct its_cmd_block *cmd,
+ struct its_cmd_desc *desc)
+{
+ its_encode_cmd(cmd, GITS_CMD_MAPVI);
+ its_encode_devid(cmd, desc->its_mapvi_cmd.dev->device_id);
+ its_encode_event_id(cmd, desc->its_mapvi_cmd.event_id);
+ its_encode_phys_id(cmd, desc->its_mapvi_cmd.phys_id);
+ its_encode_collection(cmd, desc->its_mapvi_cmd.dev->collection->col_id);
+
+ its_fixup_cmd(cmd);
+
+ return desc->its_mapvi_cmd.dev->collection;
+}
+
+static struct its_collection *its_build_movi_cmd(struct its_cmd_block *cmd,
+ struct its_cmd_desc *desc)
+{
+ its_encode_cmd(cmd, GITS_CMD_MOVI);
+ its_encode_devid(cmd, desc->its_movi_cmd.dev->device_id);
+ its_encode_event_id(cmd, desc->its_movi_cmd.id);
+ its_encode_collection(cmd, desc->its_movi_cmd.col->col_id);
+
+ its_fixup_cmd(cmd);
+
+ return desc->its_movi_cmd.dev->collection;
+}
+
+static struct its_collection *its_build_discard_cmd(struct its_cmd_block *cmd,
+ struct its_cmd_desc *desc)
+{
+ its_encode_cmd(cmd, GITS_CMD_DISCARD);
+ its_encode_devid(cmd, desc->its_discard_cmd.dev->device_id);
+ its_encode_event_id(cmd, desc->its_discard_cmd.event_id);
+
+ its_fixup_cmd(cmd);
+
+ return desc->its_discard_cmd.dev->collection;
+}
+
+static struct its_collection *its_build_inv_cmd(struct its_cmd_block *cmd,
+ struct its_cmd_desc *desc)
+{
+ its_encode_cmd(cmd, GITS_CMD_INV);
+ its_encode_devid(cmd, desc->its_inv_cmd.dev->device_id);
+ its_encode_event_id(cmd, desc->its_inv_cmd.event_id);
+
+ its_fixup_cmd(cmd);
+
+ return desc->its_inv_cmd.dev->collection;
+}
+
+static struct its_collection *its_build_invall_cmd(struct its_cmd_block *cmd,
+ struct its_cmd_desc *desc)
+{
+ its_encode_cmd(cmd, GITS_CMD_INVALL);
+ its_encode_collection(cmd, desc->its_invall_cmd.col->col_id);
+
+ its_fixup_cmd(cmd);
+
+ return NULL;
+}
+
+static u64 its_cmd_ptr_to_offset(struct its_node *its,
+ struct its_cmd_block *ptr)
+{
+ return (ptr - its->cmd_base) * sizeof(*ptr);
+}
+
+static int its_queue_full(struct its_node *its)
+{
+ int widx;
+ int ridx;
+
+ widx = its->cmd_write - its->cmd_base;
+ ridx = readl_relaxed(its->base + GITS_CREADR) / sizeof(struct its_cmd_block);
+
+ /* This is incredibly unlikely to happen, unless the ITS locks up. */
+ if (((widx + 1) % ITS_CMD_QUEUE_NR_ENTRIES) == ridx)
+ return 1;
+
+ return 0;
+}
+
+static struct its_cmd_block *its_allocate_entry(struct its_node *its)
+{
+ struct its_cmd_block *cmd;
+ u32 count = 1000000; /* 1s! */
+
+ while (its_queue_full(its)) {
+ count--;
+ if (!count) {
+ pr_err_ratelimited("ITS queue not draining\n");
+ return NULL;
+ }
+ cpu_relax();
+ udelay(1);
+ }
+
+ cmd = its->cmd_write++;
+
+ /* Handle queue wrapping */
+ if (its->cmd_write == (its->cmd_base + ITS_CMD_QUEUE_NR_ENTRIES))
+ its->cmd_write = its->cmd_base;
+
+ return cmd;
+}
+
+static struct its_cmd_block *its_post_commands(struct its_node *its)
+{
+ u64 wr = its_cmd_ptr_to_offset(its, its->cmd_write);
+
+ writel_relaxed(wr, its->base + GITS_CWRITER);
+
+ return its->cmd_write;
+}
+
+static void its_flush_cmd(struct its_node *its, struct its_cmd_block *cmd)
+{
+ /*
+ * Make sure the commands written to memory are observable by
+ * the ITS.
+ */
+ if (its->flags & ITS_FLAGS_CMDQ_NEEDS_FLUSHING)
+ __flush_dcache_area(cmd, sizeof(*cmd));
+ else
+ dsb(ishst);
+}
+
+static void its_wait_for_range_completion(struct its_node *its,
+ struct its_cmd_block *from,
+ struct its_cmd_block *to)
+{
+ u64 rd_idx, from_idx, to_idx;
+ u32 count = 1000000; /* 1s! */
+
+ from_idx = its_cmd_ptr_to_offset(its, from);
+ to_idx = its_cmd_ptr_to_offset(its, to);
+
+ while (1) {
+ rd_idx = readl_relaxed(its->base + GITS_CREADR);
+ if (rd_idx >= to_idx || rd_idx < from_idx)
+ break;
+
+ count--;
+ if (!count) {
+ pr_err_ratelimited("ITS queue timeout\n");
+ return;
+ }
+ cpu_relax();
+ udelay(1);
+ }
+}
+
+static void its_send_single_command(struct its_node *its,
+ its_cmd_builder_t builder,
+ struct its_cmd_desc *desc)
+{
+ struct its_cmd_block *cmd, *sync_cmd, *next_cmd;
+ struct its_collection *sync_col;
+
+ raw_spin_lock(&its->lock);
+
+ cmd = its_allocate_entry(its);
+ if (!cmd) { /* We're soooooo screwed... */
+ pr_err_ratelimited("ITS can't allocate, dropping command\n");
+ raw_spin_unlock(&its->lock);
+ return;
+ }
+ sync_col = builder(cmd, desc);
+ its_flush_cmd(its, cmd);
+
+ if (sync_col) {
+ sync_cmd = its_allocate_entry(its);
+ if (!sync_cmd) {
+ pr_err_ratelimited("ITS can't SYNC, skipping\n");
+ goto post;
+ }
+ its_encode_cmd(sync_cmd, GITS_CMD_SYNC);
+ its_encode_target(sync_cmd, sync_col->target_address);
+ its_fixup_cmd(sync_cmd);
+ its_flush_cmd(its, sync_cmd);
+ }
+
+post:
+ next_cmd = its_post_commands(its);
+ raw_spin_unlock(&its->lock);
+
+ its_wait_for_range_completion(its, cmd, next_cmd);
+}
+
+static void its_send_inv(struct its_device *dev, u32 event_id)
+{
+ struct its_cmd_desc desc;
+
+ desc.its_inv_cmd.dev = dev;
+ desc.its_inv_cmd.event_id = event_id;
+
+ its_send_single_command(dev->its, its_build_inv_cmd, &desc);
+}
+
+static void its_send_mapd(struct its_device *dev, int valid)
+{
+ struct its_cmd_desc desc;
+
+ desc.its_mapd_cmd.dev = dev;
+ desc.its_mapd_cmd.valid = !!valid;
+
+ its_send_single_command(dev->its, its_build_mapd_cmd, &desc);
+}
+
+static void its_send_mapc(struct its_node *its, struct its_collection *col,
+ int valid)
+{
+ struct its_cmd_desc desc;
+
+ desc.its_mapc_cmd.col = col;
+ desc.its_mapc_cmd.valid = !!valid;
+
+ its_send_single_command(its, its_build_mapc_cmd, &desc);
+}
+
+static void its_send_mapvi(struct its_device *dev, u32 irq_id, u32 id)
+{
+ struct its_cmd_desc desc;
+
+ desc.its_mapvi_cmd.dev = dev;
+ desc.its_mapvi_cmd.phys_id = irq_id;
+ desc.its_mapvi_cmd.event_id = id;
+
+ its_send_single_command(dev->its, its_build_mapvi_cmd, &desc);
+}
+
+static void its_send_movi(struct its_device *dev,
+ struct its_collection *col, u32 id)
+{
+ struct its_cmd_desc desc;
+
+ desc.its_movi_cmd.dev = dev;
+ desc.its_movi_cmd.col = col;
+ desc.its_movi_cmd.id = id;
+
+ its_send_single_command(dev->its, its_build_movi_cmd, &desc);
+}
+
+static void its_send_discard(struct its_device *dev, u32 id)
+{
+ struct its_cmd_desc desc;
+
+ desc.its_discard_cmd.dev = dev;
+ desc.its_discard_cmd.event_id = id;
+
+ its_send_single_command(dev->its, its_build_discard_cmd, &desc);
+}
+
+static void its_send_invall(struct its_node *its, struct its_collection *col)
+{
+ struct its_cmd_desc desc;
+
+ desc.its_invall_cmd.col = col;
+
+ its_send_single_command(its, its_build_invall_cmd, &desc);
+}
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 040615a..21c9d70 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -80,9 +80,27 @@
#define GICR_MOVALLR 0x0110
#define GICR_PIDR2 GICD_PIDR2

+#define GICR_CTLR_ENABLE_LPIS (1UL << 0)
+
+#define GICR_TYPER_CPU_NUMBER(r) (((r) >> 8) & 0xffff)
+
#define GICR_WAKER_ProcessorSleep (1U << 1)
#define GICR_WAKER_ChildrenAsleep (1U << 2)

+#define GICR_PROPBASER_NonShareable (0U << 10)
+#define GICR_PROPBASER_InnerShareable (1U << 10)
+#define GICR_PROPBASER_OuterShareable (2U << 10)
+#define GICR_PROPBASER_SHAREABILITY_MASK (3UL << 10)
+#define GICR_PROPBASER_nCnB (0U << 7)
+#define GICR_PROPBASER_nC (1U << 7)
+#define GICR_PROPBASER_RaWt (2U << 7)
+#define GICR_PROPBASER_RaWb (3U << 7)
+#define GICR_PROPBASER_WaWt (4U << 7)
+#define GICR_PROPBASER_WaWb (5U << 7)
+#define GICR_PROPBASER_RaWaWt (6U << 7)
+#define GICR_PROPBASER_RaWaWb (7U << 7)
+#define GICR_PROPBASER_IDBITS_MASK (0x1f)
+
/*
* Re-Distributor registers, offsets from SGI_base
*/
@@ -95,9 +113,93 @@
#define GICR_IPRIORITYR0 GICD_IPRIORITYR
#define GICR_ICFGR0 GICD_ICFGR

+#define GICR_TYPER_PLPIS (1U << 0)
#define GICR_TYPER_VLPIS (1U << 1)
#define GICR_TYPER_LAST (1U << 4)

+#define LPI_PROP_GROUP1 (1 << 1)
+#define LPI_PROP_ENABLED (1 << 0)
+
+/*
+ * ITS registers, offsets from ITS_base
+ */
+#define GITS_CTLR 0x0000
+#define GITS_IIDR 0x0004
+#define GITS_TYPER 0x0008
+#define GITS_CBASER 0x0080
+#define GITS_CWRITER 0x0088
+#define GITS_CREADR 0x0090
+#define GITS_BASER 0x0100
+#define GITS_PIDR2 GICR_PIDR2
+
+#define GITS_TRANSLATER 0x10040
+
+#define GITS_TYPER_PTA (1UL << 19)
+
+#define GITS_CBASER_VALID (1UL << 63)
+#define GITS_CBASER_nCnB (0UL << 59)
+#define GITS_CBASER_nC (1UL << 59)
+#define GITS_CBASER_RaWt (2UL << 59)
+#define GITS_CBASER_RaWb (3UL << 59)
+#define GITS_CBASER_WaWt (4UL << 59)
+#define GITS_CBASER_WaWb (5UL << 59)
+#define GITS_CBASER_RaWaWt (6UL << 59)
+#define GITS_CBASER_RaWaWb (7UL << 59)
+#define GITS_CBASER_NonShareable (0UL << 10)
+#define GITS_CBASER_InnerShareable (1UL << 10)
+#define GITS_CBASER_OuterShareable (2UL << 10)
+#define GITS_CBASER_SHAREABILITY_MASK (3UL << 10)
+
+#define GITS_BASER_NR_REGS 8
+
+#define GITS_BASER_VALID (1UL << 63)
+#define GITS_BASER_nCnB (0UL << 59)
+#define GITS_BASER_nC (1UL << 59)
+#define GITS_BASER_RaWt (2UL << 59)
+#define GITS_BASER_RaWb (3UL << 59)
+#define GITS_BASER_WaWt (4UL << 59)
+#define GITS_BASER_WaWb (5UL << 59)
+#define GITS_BASER_RaWaWt (6UL << 59)
+#define GITS_BASER_RaWaWb (7UL << 59)
+#define GITS_BASER_TYPE_SHIFT (56)
+#define GITS_BASER_TYPE(r) (((r) >> GITS_BASER_TYPE_SHIFT) & 7)
+#define GITS_BASER_ENTRY_SIZE_SHIFT (48)
+#define GITS_BASER_ENTRY_SIZE(r) ((((r) >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0xff) + 1)
+#define GITS_BASER_NonShareable (0UL << 10)
+#define GITS_BASER_InnerShareable (1UL << 10)
+#define GITS_BASER_OuterShareable (2UL << 10)
+#define GITS_BASER_SHAREABILITY_SHIFT (10)
+#define GITS_BASER_SHAREABILITY_MASK (3UL << GITS_BASER_SHAREABILITY_SHIFT)
+#define GITS_BASER_PAGE_SIZE_SHIFT (8)
+#define GITS_BASER_PAGE_SIZE_4K (0UL << GITS_BASER_PAGE_SIZE_SHIFT)
+#define GITS_BASER_PAGE_SIZE_16K (1UL << GITS_BASER_PAGE_SIZE_SHIFT)
+#define GITS_BASER_PAGE_SIZE_64K (2UL << GITS_BASER_PAGE_SIZE_SHIFT)
+#define GITS_BASER_PAGE_SIZE_MASK (3UL << GITS_BASER_PAGE_SIZE_SHIFT)
+
+#define GITS_BASER_TYPE_NONE 0
+#define GITS_BASER_TYPE_DEVICE 1
+#define GITS_BASER_TYPE_VCPU 2
+#define GITS_BASER_TYPE_CPU 3
+#define GITS_BASER_TYPE_COLLECTION 4
+#define GITS_BASER_TYPE_RESERVED5 5
+#define GITS_BASER_TYPE_RESERVED6 6
+#define GITS_BASER_TYPE_RESERVED7 7
+
+/*
+ * ITS commands
+ */
+#define GITS_CMD_MAPD 0x08
+#define GITS_CMD_MAPC 0x09
+#define GITS_CMD_MAPVI 0x0a
+#define GITS_CMD_MOVI 0x01
+#define GITS_CMD_DISCARD 0x0f
+#define GITS_CMD_INV 0x0c
+#define GITS_CMD_MOVALL 0x0e
+#define GITS_CMD_INVALL 0x0d
+#define GITS_CMD_INT 0x03
+#define GITS_CMD_CLEAR 0x04
+#define GITS_CMD_SYNC 0x05
+
/*
* CPU interface registers
*/
--
2.1.3

2014-11-24 14:57:57

by Jiang Liu

Subject: Re: [PATCH v3 06/13] irqchip: GICv3: ITS: LPI allocator



On 2014/11/24 22:35, Marc Zyngier wrote:
> LPIs are the type of interrupts that are used by the ITS. Given
> the size of the namespace (anywhere between 16 and 32bit), interrupt
> IDs are allocated in chunks of 32.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 103 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 103 insertions(+)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index d24bebd..4154a16 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -586,3 +586,106 @@ static struct irq_chip its_irq_chip = {
> .irq_eoi = its_eoi_irq,
> .irq_set_affinity = its_set_affinity,
> };
> +
> +/*
> + * How we allocate LPIs:
> + *
> + * The GIC has id_bits bits for interrupt identifiers. From there, we
> + * must subtract 8192 which are reserved for SGIs/PPIs/SPIs. Then, as
> + * we allocate LPIs by chunks of 32, we can shift the whole thing by 5
> + * bits to the right.
Just curious, why 32? sizeof(long) is 4 on ARM64?

2014-11-24 15:32:57

by Marc Zyngier

Subject: Re: [PATCH v3 06/13] irqchip: GICv3: ITS: LPI allocator

On 24/11/14 14:57, Jiang Liu wrote:
>
>
> On 2014/11/24 22:35, Marc Zyngier wrote:
>> LPIs are the type of interrupts that are used by the ITS. Given
>> the size of the namespace (anywhere between 16 and 32bit), interrupt
>> IDs are allocated in chunks of 32.
>>
>> Signed-off-by: Marc Zyngier <[email protected]>
>> ---
>> drivers/irqchip/irq-gic-v3-its.c | 103 +++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 103 insertions(+)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index d24bebd..4154a16 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -586,3 +586,106 @@ static struct irq_chip its_irq_chip = {
>> .irq_eoi = its_eoi_irq,
>> .irq_set_affinity = its_set_affinity,
>> };
>> +
>> +/*
>> + * How we allocate LPIs:
>> + *
>> + * The GIC has id_bits bits for interrupt identifiers. From there, we
>> + * must subtract 8192 which are reserved for SGIs/PPIs/SPIs. Then, as
>> + * we allocate LPIs by chunks of 32, we can shift the whole thing by 5
>> + * bits to the right.
> Just curious, why 32? sizeof(long) is 4 on ARM64?

No, sizeof(long) == 8, as on any sane 64bit architecture.

There are two reasons for this:
- the ID space is rather large (at least 16 bits, possibly 32 bits), so
we're trying not to allocate the whole bitmap in one go.
- 32 is the maximum number of interrupts an MSI-capable device can
request. Allocating 32 interrupts in one go makes sure that these
interrupts are contiguous and satisfy the MSI requirements.
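
The chunk arithmetic from the quoted comment can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not the driver's code: the helper names (lpi_nr_chunks, lpi_chunk_to_id, lpi_id_to_chunk) are made up for the example, and only the 8192 offset and the shift by 5 come from the patch comment.

```c
#include <assert.h>
#include <stdint.h>

#define LPI_ID_BASE     8192u /* IDs below this are SGIs/PPIs/SPIs (from the patch comment) */
#define LPI_CHUNK_SHIFT 5u    /* 32 interrupt IDs per chunk */

/* Hypothetical helper: number of 32-ID chunks for a GIC with id_bits of ID space. */
static uint64_t lpi_nr_chunks(unsigned int id_bits)
{
	/* 2^13 == 8192 would leave no room for any LPI at all. */
	assert(id_bits >= 14 && id_bits <= 32);
	return ((UINT64_C(1) << id_bits) - LPI_ID_BASE) >> LPI_CHUNK_SHIFT;
}

/* Hypothetical helper: first interrupt ID covered by a given chunk. */
static uint32_t lpi_chunk_to_id(uint32_t chunk)
{
	return (chunk << LPI_CHUNK_SHIFT) + LPI_ID_BASE;
}

/* Hypothetical helper: chunk that contains a given LPI ID. */
static uint32_t lpi_id_to_chunk(uint32_t id)
{
	assert(id >= LPI_ID_BASE);
	return (id - LPI_ID_BASE) >> LPI_CHUNK_SHIFT;
}
```

With 16 ID bits this yields 1792 chunks, so a per-chunk bitmap needs 1792 bits rather than the 57344 bits a per-interrupt bitmap would take.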

Hope this helps,

M.
--
Jazz is not dead. It just smells funny...

2014-11-25 21:08:10

by Stuart Yoder

Subject: Re: [PATCH v3 10/13] irqchip: GICv3: ITS: DT probing and initialization

On Mon, Nov 24, 2014 at 8:35 AM, Marc Zyngier <[email protected]> wrote:
> Add the code that probes the ITS from the device tree,
> and initialize it.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 169 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 169 insertions(+)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 532c6df..e9d1615 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1231,3 +1231,172 @@ static const struct irq_domain_ops its_domain_ops = {
> .alloc = its_irq_domain_alloc,
> .free = its_irq_domain_free,
> };
> +
> +static int its_probe(struct device_node *node, struct irq_domain *parent)
> +{
> + struct resource res;
> + struct its_node *its;
> + void __iomem *its_base;
> + u32 val;
> + u64 baser, tmp;
> + int err;
> +
> + err = of_address_to_resource(node, 0, &res);
> + if (err) {
> + pr_warn("%s: no regs?\n", node->full_name);
> + return -ENXIO;
> + }
> +
> + its_base = ioremap(res.start, resource_size(&res));
> + if (!its_base) {
> + pr_warn("%s: unable to map registers\n", node->full_name);
> + return -ENOMEM;
> + }
> +
> + val = readl_relaxed(its_base + GITS_PIDR2) & GIC_PIDR2_ARCH_MASK;
> + if (val != 0x30 && val != 0x40) {
> + pr_warn("%s: no ITS detected, giving up\n", node->full_name);
> + err = -ENODEV;
> + goto out_unmap;
> + }
> +
> + pr_info("ITS: %s\n", node->full_name);
> +
> + its = kzalloc(sizeof(*its), GFP_KERNEL);
> + if (!its) {
> + err = -ENOMEM;
> + goto out_unmap;
> + }
> +
> + raw_spin_lock_init(&its->lock);
> + INIT_LIST_HEAD(&its->entry);
> + INIT_LIST_HEAD(&its->its_device_list);
> + its->base = its_base;
> + its->phys_base = res.start;
> + its->msi_chip.of_node = node;
> + its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
> +
> + its->cmd_base = kzalloc(ITS_CMD_QUEUE_SZ, GFP_KERNEL);
> + if (!its->cmd_base) {
> + err = -ENOMEM;
> + goto out_free_its;
> + }
> + its->cmd_write = its->cmd_base;
> +
> + err = its_alloc_tables(its);
> + if (err)
> + goto out_free_cmd;
> +
> + err = its_alloc_collections(its);
> + if (err)
> + goto out_free_tables;
> +
> + baser = (virt_to_phys(its->cmd_base) |
> + GITS_CBASER_WaWb |
> + GITS_CBASER_InnerShareable |
> + (ITS_CMD_QUEUE_SZ / SZ_4K - 1) |
> + GITS_CBASER_VALID);
> +
> + writeq_relaxed(baser, its->base + GITS_CBASER);
> + tmp = readq_relaxed(its->base + GITS_CBASER);
> + writeq_relaxed(0, its->base + GITS_CWRITER);
> + writel_relaxed(1, its->base + GITS_CTLR);
> +
> + if ((tmp ^ baser) & GITS_BASER_SHAREABILITY_MASK) {
> + pr_info("ITS: using cache flushing for cmd queue\n");
> + its->flags |= ITS_FLAGS_CMDQ_NEEDS_FLUSHING;
> + }
> +
> + if (of_property_read_bool(its->msi_chip.of_node, "msi-controller")) {
> + its->domain = irq_domain_add_tree(NULL, &its_domain_ops, its);
> + if (!its->domain) {
> + err = -ENOMEM;
> + goto out_free_tables;
> + }
> +
> + its->domain->parent = parent;
> +
> + its->msi_chip.domain = pci_msi_create_irq_domain(node,
> + &its_pci_msi_domain_info,
> + its->domain);
> + if (!its->msi_chip.domain) {
> + err = -ENOMEM;
> + goto out_free_domains;
> + }
> +
> + err = of_pci_msi_chip_add(&its->msi_chip);
> + if (err)
> + goto out_free_domains;
> + }

Hi Marc,

We have a requirement to have both PCI and non-PCI buses use the GIC_ITS.
Above, you have the hardcoded assumption that this is PCI. How do 2 different
bus types share the ITS at the same time?

Thanks,
Stuart Yoder
Freescale

2014-11-26 08:07:30

by Jason Cooper

Subject: Re: [PATCH v3 00/13] arm64: PCI/MSI: GICv3 ITS support (stacked domain edition)

Marc,

On Mon, Nov 24, 2014 at 02:35:07PM +0000, Marc Zyngier wrote:
> The GICv3 architecture provides a way to implement support for
> MSI/MSI-X using a specific block called the ITS (Interrupt Translation
> Service).
>
> The ITS can be accurately described as "page tables for
> interrupts". If you think this sounds scary, you're spot on. It uses a
> set of opaque memory tables that are manipulated through commands
> (software almost never touches the tables directly). In order to make
> it slightly easier to digest, the code has been split into (mostly)
> logical units.
>
> To make things more fun, this relies on Jiang Liu's stacked domain
> patch series as now merged in tip/irq/irqdomain:
>
> - patch 1 imports the new asm-generic/msi.h file into arch/arm64
> - patches 2 to 13 are the bulk of the ITS driver.
>
> This has been tested on arm64 with an FVP model, and is based on
> tip/irq/irqdomain. The whole thing is available at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git irq/gicv3-its
>
> Unless someone screams murder, I consider this to be ready for merge.
>
> M.
>
> From v2 [2]:
> - rebased on top of the stable version of tip/irq/irqdomain
> - use irq_domain_reset_irq_data instead of
> irq_domain_set_hwirq_and_chip on the free path
> - use pci_msi_mask_irq instead of mask_msi_irq
> - use host_data to pass the ITS structure around
> - top-level MSI domain is now identified by the ITS of_node
>
> From v1 [1]:
> - rebased on top of tip/irq/irqdomain
> - dropped the arm64-specific implementation of arch_setup_msi_irqs and co.
> - reworked the whole ITS/MSI setup to use the new MSI/PCI split
>
> [1]: http://lwn.net/Articles/619788/
> [2]: https://lkml.org/lkml/2014/11/18/825
>
> Marc Zyngier (13):
> arm64: PCI/MSI: Use asm-generic/msi.h
> irqchip: GICv3: Convert to domain hierarchy
> irqchip: GICv3: rework redistributor structure
> irqchip: GICv3: ITS command queue
> irqchip: GICv3: ITS: irqchip implementation
> irqchip: GICv3: ITS: LPI allocator
> irqchip: GICv3: ITS: tables allocators
> irqchip: GICv3: ITS: device allocation and configuration
> irqchip: GICv3: ITS: MSI support
> irqchip: GICv3: ITS: DT probing and initialization
> irqchip: GICv3: ITS: plug ITS init into main GICv3 code
> irqchip: GICv3: ITS: enable compilation of the ITS driver
> irqchip: GICv3: Binding updates for ITS
>
> Documentation/devicetree/bindings/arm/gic-v3.txt | 39 +
> arch/arm64/Kconfig | 1 +
> arch/arm64/include/asm/Kbuild | 1 +
> drivers/irqchip/Kconfig | 5 +
> drivers/irqchip/Makefile | 1 +
> drivers/irqchip/irq-gic-v3-its.c | 1402 ++++++++++++++++++++++
> drivers/irqchip/irq-gic-v3.c | 156 ++-
> include/linux/irqchip/arm-gic-v3.h | 128 ++
> 8 files changed, 1693 insertions(+), 40 deletions(-)
> create mode 100644 drivers/irqchip/irq-gic-v3-its.c

Applied to irqchip/core with a dependency on tip/irq/irqdomain.

thx,

Jason.

2014-11-26 10:14:51

by Marc Zyngier

Subject: Re: [PATCH v3 10/13] irqchip: GICv3: ITS: DT probing and initialization

Hi Stuart,

On 25/11/14 21:08, Stuart Yoder wrote:
> On Mon, Nov 24, 2014 at 8:35 AM, Marc Zyngier <[email protected]> wrote:
>> Add the code that probes the ITS from the device tree,
>> and initialize it.
>>
>> Signed-off-by: Marc Zyngier <[email protected]>
>> ---
>> drivers/irqchip/irq-gic-v3-its.c | 169 +++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 169 insertions(+)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index 532c6df..e9d1615 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -1231,3 +1231,172 @@ static const struct irq_domain_ops its_domain_ops = {
>> .alloc = its_irq_domain_alloc,
>> .free = its_irq_domain_free,
>> };
>> +
>> +static int its_probe(struct device_node *node, struct irq_domain *parent)
>> +{
>> + struct resource res;
>> + struct its_node *its;
>> + void __iomem *its_base;
>> + u32 val;
>> + u64 baser, tmp;
>> + int err;
>> +
>> + err = of_address_to_resource(node, 0, &res);
>> + if (err) {
>> + pr_warn("%s: no regs?\n", node->full_name);
>> + return -ENXIO;
>> + }
>> +
>> + its_base = ioremap(res.start, resource_size(&res));
>> + if (!its_base) {
>> + pr_warn("%s: unable to map registers\n", node->full_name);
>> + return -ENOMEM;
>> + }
>> +
>> + val = readl_relaxed(its_base + GITS_PIDR2) & GIC_PIDR2_ARCH_MASK;
>> + if (val != 0x30 && val != 0x40) {
>> + pr_warn("%s: no ITS detected, giving up\n", node->full_name);
>> + err = -ENODEV;
>> + goto out_unmap;
>> + }
>> +
>> + pr_info("ITS: %s\n", node->full_name);
>> +
>> + its = kzalloc(sizeof(*its), GFP_KERNEL);
>> + if (!its) {
>> + err = -ENOMEM;
>> + goto out_unmap;
>> + }
>> +
>> + raw_spin_lock_init(&its->lock);
>> + INIT_LIST_HEAD(&its->entry);
>> + INIT_LIST_HEAD(&its->its_device_list);
>> + its->base = its_base;
>> + its->phys_base = res.start;
>> + its->msi_chip.of_node = node;
>> + its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
>> +
>> + its->cmd_base = kzalloc(ITS_CMD_QUEUE_SZ, GFP_KERNEL);
>> + if (!its->cmd_base) {
>> + err = -ENOMEM;
>> + goto out_free_its;
>> + }
>> + its->cmd_write = its->cmd_base;
>> +
>> + err = its_alloc_tables(its);
>> + if (err)
>> + goto out_free_cmd;
>> +
>> + err = its_alloc_collections(its);
>> + if (err)
>> + goto out_free_tables;
>> +
>> + baser = (virt_to_phys(its->cmd_base) |
>> + GITS_CBASER_WaWb |
>> + GITS_CBASER_InnerShareable |
>> + (ITS_CMD_QUEUE_SZ / SZ_4K - 1) |
>> + GITS_CBASER_VALID);
>> +
>> + writeq_relaxed(baser, its->base + GITS_CBASER);
>> + tmp = readq_relaxed(its->base + GITS_CBASER);
>> + writeq_relaxed(0, its->base + GITS_CWRITER);
>> + writel_relaxed(1, its->base + GITS_CTLR);
>> +
>> + if ((tmp ^ baser) & GITS_BASER_SHAREABILITY_MASK) {
>> + pr_info("ITS: using cache flushing for cmd queue\n");
>> + its->flags |= ITS_FLAGS_CMDQ_NEEDS_FLUSHING;
>> + }
>> +
>> + if (of_property_read_bool(its->msi_chip.of_node, "msi-controller")) {
>> + its->domain = irq_domain_add_tree(NULL, &its_domain_ops, its);
>> + if (!its->domain) {
>> + err = -ENOMEM;
>> + goto out_free_tables;
>> + }
>> +
>> + its->domain->parent = parent;
>> +
>> + its->msi_chip.domain = pci_msi_create_irq_domain(node,
>> + &its_pci_msi_domain_info,
>> + its->domain);
>> + if (!its->msi_chip.domain) {
>> + err = -ENOMEM;
>> + goto out_free_domains;
>> + }
>> +
>> + err = of_pci_msi_chip_add(&its->msi_chip);
>> + if (err)
>> + goto out_free_domains;
>> + }
>
> Hi Marc,
>
> We have a requirement to have both PCI and non-PCI buses use the GIC_ITS.
> Above, you have the hardcoded assumption that this is PCI. How do 2 different
> bus types share the ITS at the same time?

This set of patches specifically targets PCI, as this is the only thing
that I can realistically test.

When it comes to non-PCI uses of the ITS, it shouldn't be too hard: just
instantiate a non-PCI MSI domain sitting on top of the same ITS domain.
The split in responsibilities between MSI and ITS domains is designed to
cover exactly this.

This of course assumes that your non-PCI devices behave in a similar way
to PCI devices (programmable event ID, as well as unique, discoverable
device IDs).
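
As a rough picture of that split, here is a toy userspace model. It is a sketch under loud assumptions and uses none of the kernel's irqdomain API: every toy_* name is hypothetical, and the "stream_id" of the non-PCI device is invented for the example. The only detail borrowed from the patch is that the PCI device ID has the PCI_DEVID() layout (bus in the upper byte, devfn in the lower byte).

```c
#include <assert.h>
#include <stdint.h>

/* Toy ITS layer: bus-agnostic, it only sees (device ID, event ID) pairs. */
struct toy_its {
	uint32_t next_lpi;      /* next LPI number to hand out */
	unsigned int nr_mapped; /* how many events have been mapped so far */
};

/* Toy MSI layer: one instance per bus type, all pointing at the same ITS. */
struct toy_msi_domain {
	const char *name;
	struct toy_its *parent;
	uint32_t (*dev_id)(const void *dev); /* the bus-specific part */
};

struct toy_pci_dev { uint8_t bus; uint8_t devfn; };
struct toy_plat_dev { uint32_t stream_id; }; /* hypothetical non-PCI device */

static uint32_t toy_pci_dev_id(const void *dev)
{
	const struct toy_pci_dev *p = dev;

	return ((uint32_t)p->bus << 8) | p->devfn; /* same layout as PCI_DEVID() */
}

static uint32_t toy_plat_dev_id(const void *dev)
{
	return ((const struct toy_plat_dev *)dev)->stream_id;
}

/* ITS side: would issue the mapping commands; here it just hands out an LPI. */
static uint32_t toy_its_alloc(struct toy_its *its, uint32_t dev_id, uint32_t event_id)
{
	(void)dev_id;
	(void)event_id;
	its->nr_mapped++;
	return its->next_lpi++;
}

/* MSI side: translate the device into an ID, then defer to the shared parent. */
static uint32_t toy_msi_alloc(struct toy_msi_domain *d, const void *dev, uint32_t event_id)
{
	assert(d && d->parent && d->dev_id);
	return toy_its_alloc(d->parent, d->dev_id(dev), event_id);
}
```

Two toy_msi_domain instances, one using toy_pci_dev_id and one using toy_plat_dev_id, can share a single toy_its. Only the device ID derivation differs between them.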

Hope this helps,

M.
--
Jazz is not dead. It just smells funny...

2014-12-04 21:52:11

by Stuart Yoder

Subject: Re: [PATCH v3 09/13] irqchip: GICv3: ITS: MSI support

On Mon, Nov 24, 2014 at 8:35 AM, Marc Zyngier <[email protected]> wrote:

> +/*
> + * We need a value to serve as a irq-type for LPIs. Choose one that will
> + * hopefully pique the interest of the reviewer.
> + */
> +#define GIC_IRQ_TYPE_LPI 0xa110c8ed

Ok, my interest is piqued. Why this value?

Stuart

2014-12-04 21:58:31

by Thomas Gleixner

Subject: Re: [PATCH v3 09/13] irqchip: GICv3: ITS: MSI support

On Thu, 4 Dec 2014, Stuart Yoder wrote:
> On Mon, Nov 24, 2014 at 8:35 AM, Marc Zyngier <[email protected]> wrote:
>
> > +/*
> > + * We need a value to serve as a irq-type for LPIs. Choose one that will
> > + * hopefully pique the interest of the reviewer.
> > + */
> > +#define GIC_IRQ_TYPE_LPI 0xa110c8ed
>
> Ok, my interest is piqued. Why this value?

Hint: 0x1337

2014-12-05 10:10:41

by Marc Zyngier

Subject: Re: [PATCH v3 09/13] irqchip: GICv3: ITS: MSI support

Hi Stuart,

On 04/12/14 21:52, Stuart Yoder wrote:
> On Mon, Nov 24, 2014 at 8:35 AM, Marc Zyngier <[email protected]> wrote:
>
>> +/*
>> + * We need a value to serve as a irq-type for LPIs. Choose one that will
>> + * hopefully pique the interest of the reviewer.
>> + */
>> +#define GIC_IRQ_TYPE_LPI 0xa110c8ed
>
> Ok, my interest is piqued. Why this value?

See the xlate function in the main GICv3 driver.

We need something to indicate that we want it to allocate an LPI, but
this function is mostly DT-specific, and we don't have an LPI type in
the binding just yet (I've purposely pushed back on it until we have a
clear idea of where we want to take LPIs).

So this value is just a made-up number that is unlikely to ever find its
place in a DT binding.
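
A minimal userspace sketch of how such a translation step could tell the magic value apart from real DT interrupt types is shown below. It is an assumption-laden illustration, not the GICv3 driver's xlate function: gic_xlate_sketch is a made-up name, the SPI/PPI offsets follow the usual GIC DT binding convention (SPI n maps to ID n + 32, PPI n to ID n + 16), and passing the LPI number through unchanged is an assumption about the intended behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define GIC_IRQ_TYPE_LPI 0xa110c8edu /* magic value from the patch */

/*
 * Hypothetical translation helper: intspec[0] is the interrupt type,
 * intspec[1] the interrupt number, intspec[2] the trigger flags (unused here).
 */
static int gic_xlate_sketch(const uint32_t *intspec, unsigned int intsize,
			    unsigned long *hwirq)
{
	assert(intspec && hwirq);

	if (intsize < 3)
		return -EINVAL;

	switch (intspec[0]) {
	case 0: /* SPI */
		*hwirq = intspec[1] + 32;
		break;
	case 1: /* PPI */
		*hwirq = intspec[1] + 16;
		break;
	case GIC_IRQ_TYPE_LPI: /* never written in a DT, only built by the ITS code */
		*hwirq = intspec[1];
		break;
	default:
		return -EINVAL;
	}

	return 0;
}
```

Because no device tree uses 0xa110c8ed as an interrupt type, only the ITS code, which builds the specifier itself, reaches the LPI case.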

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2014-12-08 03:30:30

by Abel Wu

Subject: Re: [PATCH v3 09/13] irqchip: GICv3: ITS: MSI support

Hi Marc,
On 2014/11/24 22:35, Marc Zyngier wrote:

> Now, the bit of code that allows us to use the ITS as an MSI controller.
> Both MSI and MSI-X are supported.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 176 +++++++++++++++++++++++++++++++++++++
> include/linux/irqchip/arm-gic-v3.h | 6 ++
> 2 files changed, 182 insertions(+)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index d687fd4..532c6df 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -587,12 +587,47 @@ static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
> return IRQ_SET_MASK_OK_DONE;
> }
>
> +static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
> +{
> + struct its_device *its_dev = irq_data_get_irq_chip_data(d);
> + struct its_node *its;
> + u64 addr;
> +
> + its = its_dev->its;
> + addr = its->phys_base + GITS_TRANSLATER;
> +
> + msg->address_lo = addr & ((1UL << 32) - 1);
> + msg->address_hi = addr >> 32;
> + msg->data = its_get_event_id(d);
> +}
> +
> static struct irq_chip its_irq_chip = {
> .name = "ITS",
> .irq_mask = its_mask_irq,
> .irq_unmask = its_unmask_irq,
> .irq_eoi = its_eoi_irq,
> .irq_set_affinity = its_set_affinity,
> + .irq_compose_msi_msg = its_irq_compose_msi_msg,
> +};
> +
> +static void its_mask_msi_irq(struct irq_data *d)
> +{
> + pci_msi_mask_irq(d);
> + irq_chip_mask_parent(d);
> +}
> +
> +static void its_unmask_msi_irq(struct irq_data *d)
> +{
> + pci_msi_unmask_irq(d);
> + irq_chip_unmask_parent(d);
> +}
> +
> +static struct irq_chip its_msi_irq_chip = {
> + .name = "ITS-MSI",
> + .irq_unmask = its_unmask_msi_irq,
> + .irq_mask = its_mask_msi_irq,
> + .irq_eoi = irq_chip_eoi_parent,
> + .irq_write_msi_msg = pci_msi_domain_write_msg,
> };
>
> /*
> @@ -1055,3 +1090,144 @@ static void its_free_device(struct its_device *its_dev)
> kfree(its_dev->itt);
> kfree(its_dev);
> }
> +
> +static int its_alloc_device_irq(struct its_device *dev, irq_hw_number_t *hwirq)
> +{
> + int idx;
> +
> + idx = find_first_zero_bit(dev->lpi_map, dev->nr_lpis);
> + if (idx == dev->nr_lpis)
> + return -ENOSPC;
> +
> + *hwirq = dev->lpi_base + idx;
> + set_bit(idx, dev->lpi_map);
> +
> + /* Map the GIC irq ID to the device */
> + its_send_mapvi(dev, *hwirq, idx);

It would be better if we did the hardware-level initialization in domain.{activate,deactivate}.

> +
> + return 0;
> +}
> +
> +static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
> + int nvec, msi_alloc_info_t *info)
> +{
> + struct pci_dev *pdev;
> + struct its_node *its;
> + u32 dev_id;
> + struct its_device *its_dev;
> +
> + if (!dev_is_pci(dev))
> + return -EINVAL;
> +
> + pdev = to_pci_dev(dev);
> + dev_id = PCI_DEVID(pdev->bus->number, pdev->devfn);
> + its = domain->parent->host_data;
> +
> + its_dev = its_find_device(its, dev_id);
> + if (WARN_ON(its_dev))
> + return -EINVAL;
> +
> + its_dev = its_create_device(its, dev_id, nvec);
> + if (!its_dev)
> + return -ENOMEM;
> +
> + dev_dbg(&pdev->dev, "ITT %d entries, %d bits\n", nvec, ilog2(nvec));
> +
> + info->scratchpad[0].ptr = its_dev;
> + info->scratchpad[1].ptr = dev;
> + return 0;
> +}
> +
> +static struct msi_domain_ops its_pci_msi_ops = {
> + .msi_prepare = its_msi_prepare,
> +};
> +
> +static struct msi_domain_info its_pci_msi_domain_info = {
> + .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> + MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX),
> + .ops = &its_pci_msi_ops,
> + .chip = &its_msi_irq_chip,
> +};
> +
> +static int its_irq_gic_domain_alloc(struct irq_domain *domain,
> + unsigned int virq,
> + irq_hw_number_t hwirq)
> +{
> + struct of_phandle_args args;
> +
> + args.np = domain->parent->of_node;
> + args.args_count = 3;
> + args.args[0] = GIC_IRQ_TYPE_LPI;
> + args.args[1] = hwirq;
> + args.args[2] = IRQ_TYPE_EDGE_RISING;
> +
> + return irq_domain_alloc_irqs_parent(domain, virq, 1, &args);
> +}
> +
> +static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
> + unsigned int nr_irqs, void *args)
> +{
> + msi_alloc_info_t *info = args;
> + struct its_device *its_dev = info->scratchpad[0].ptr;
> + irq_hw_number_t hwirq;
> + int err;
> + int i;
> +
> + for (i = 0; i < nr_irqs; i++) {
> + err = its_alloc_device_irq(its_dev, &hwirq);
> + if (err)
> + return err;
> +
> + err = its_irq_gic_domain_alloc(domain, virq + i, hwirq);
> + if (err)
> + return err;
> +
> + irq_domain_set_hwirq_and_chip(domain, virq + i,
> + hwirq, &its_irq_chip, its_dev);
> + dev_dbg(info->scratchpad[1].ptr, "ID:%d pID:%d vID:%d\n",
> + (int)(hwirq - its_dev->lpi_base), (int)hwirq, virq + i);
> + }
> +
> + return 0;
> +}
> +
> +static void its_irq_domain_free(struct irq_domain *domain, unsigned int virq,
> + unsigned int nr_irqs)
> +{
> + struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> + struct its_device *its_dev = irq_data_get_irq_chip_data(d);
> + int i;
> +
> + for (i = 0; i < nr_irqs; i++) {
> + struct irq_data *data = irq_domain_get_irq_data(domain,
> + virq + i);
> + int event = its_get_event_id(data);
> +
> + /* Stop the delivery of interrupts */
> + its_send_discard(its_dev, event);
> +
> + /* Mark interrupt index as unused */
> + clear_bit(event, its_dev->lpi_map);
> +
> + /* Nuke the entry in the domain */
> + irq_domain_reset_irq_data(d);

I think you mean "data" here, instead of "d"?

Regards,
Abel

> + }
> +
> + /* If all interrupts have been freed, start mopping the floor */
> + if (bitmap_empty(its_dev->lpi_map, its_dev->nr_lpis)) {
> + its_lpi_free(its_dev->lpi_map,
> + its_dev->lpi_base,
> + its_dev->nr_lpis);
> +
> + /* Unmap device/itt */
> + its_send_mapd(its_dev, 0);
> + its_free_device(its_dev);
> + }
> +
> + irq_domain_free_irqs_parent(domain, virq, nr_irqs);
> +}
> +
> +static const struct irq_domain_ops its_domain_ops = {
> + .alloc = its_irq_domain_alloc,
> + .free = its_irq_domain_free,
> +};
> diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
> index 21c9d70..0ed30d7 100644
> --- a/include/linux/irqchip/arm-gic-v3.h
> +++ b/include/linux/irqchip/arm-gic-v3.h
> @@ -295,6 +295,12 @@
>
> #include <linux/stringify.h>
>
> +/*
> + * We need a value to serve as a irq-type for LPIs. Choose one that will
> + * hopefully pique the interest of the reviewer.
> + */
> +#define GIC_IRQ_TYPE_LPI 0xa110c8ed
> +
> struct rdists {
> struct {
> void __iomem *rd_base;


2014-12-08 09:32:36

by Marc Zyngier

Subject: Re: [PATCH v3 09/13] irqchip: GICv3: ITS: MSI support

On 08/12/14 03:28, Yun Wu (Abel) wrote:
> Hi Marc,
> On 2014/11/24 22:35, Marc Zyngier wrote:
>
>> Now, the bit of code that allows us to use the ITS as an MSI controller.
>> Both MSI and MSI-X are supported.
>>
>> Signed-off-by: Marc Zyngier <[email protected]>
>> ---
>> drivers/irqchip/irq-gic-v3-its.c | 176 +++++++++++++++++++++++++++++++++++++
>> include/linux/irqchip/arm-gic-v3.h | 6 ++
>> 2 files changed, 182 insertions(+)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index d687fd4..532c6df 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -587,12 +587,47 @@ static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
>> return IRQ_SET_MASK_OK_DONE;
>> }
>>
>> +static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
>> +{
>> + struct its_device *its_dev = irq_data_get_irq_chip_data(d);
>> + struct its_node *its;
>> + u64 addr;
>> +
>> + its = its_dev->its;
>> + addr = its->phys_base + GITS_TRANSLATER;
>> +
>> + msg->address_lo = addr & ((1UL << 32) - 1);
>> + msg->address_hi = addr >> 32;
>> + msg->data = its_get_event_id(d);
>> +}
>> +
>> static struct irq_chip its_irq_chip = {
>> .name = "ITS",
>> .irq_mask = its_mask_irq,
>> .irq_unmask = its_unmask_irq,
>> .irq_eoi = its_eoi_irq,
>> .irq_set_affinity = its_set_affinity,
>> + .irq_compose_msi_msg = its_irq_compose_msi_msg,
>> +};
>> +
>> +static void its_mask_msi_irq(struct irq_data *d)
>> +{
>> + pci_msi_mask_irq(d);
>> + irq_chip_mask_parent(d);
>> +}
>> +
>> +static void its_unmask_msi_irq(struct irq_data *d)
>> +{
>> + pci_msi_unmask_irq(d);
>> + irq_chip_unmask_parent(d);
>> +}
>> +
>> +static struct irq_chip its_msi_irq_chip = {
>> + .name = "ITS-MSI",
>> + .irq_unmask = its_unmask_msi_irq,
>> + .irq_mask = its_mask_msi_irq,
>> + .irq_eoi = irq_chip_eoi_parent,
>> + .irq_write_msi_msg = pci_msi_domain_write_msg,
>> };
>>
>> /*
>> @@ -1055,3 +1090,144 @@ static void its_free_device(struct its_device *its_dev)
>> kfree(its_dev->itt);
>> kfree(its_dev);
>> }
>> +
>> +static int its_alloc_device_irq(struct its_device *dev, irq_hw_number_t *hwirq)
>> +{
>> + int idx;
>> +
>> + idx = find_first_zero_bit(dev->lpi_map, dev->nr_lpis);
>> + if (idx == dev->nr_lpis)
>> + return -ENOSPC;
>> +
>> + *hwirq = dev->lpi_base + idx;
>> + set_bit(idx, dev->lpi_map);
>> +
>> + /* Map the GIC irq ID to the device */
>> + its_send_mapvi(dev, *hwirq, idx);
>
> It would be better if we do hardware-level initialization in domain.{activate,deactivate}.

That'd certainly be possible. I'll have a look.

>> +
>> + return 0;
>> +}
>> +
>> +static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
>> + int nvec, msi_alloc_info_t *info)
>> +{
>> + struct pci_dev *pdev;
>> + struct its_node *its;
>> + u32 dev_id;
>> + struct its_device *its_dev;
>> +
>> + if (!dev_is_pci(dev))
>> + return -EINVAL;
>> +
>> + pdev = to_pci_dev(dev);
>> + dev_id = PCI_DEVID(pdev->bus->number, pdev->devfn);
>> + its = domain->parent->host_data;
>> +
>> + its_dev = its_find_device(its, dev_id);
>> + if (WARN_ON(its_dev))
>> + return -EINVAL;
>> +
>> + its_dev = its_create_device(its, dev_id, nvec);
>> + if (!its_dev)
>> + return -ENOMEM;
>> +
>> + dev_dbg(&pdev->dev, "ITT %d entries, %d bits\n", nvec, ilog2(nvec));
>> +
>> + info->scratchpad[0].ptr = its_dev;
>> + info->scratchpad[1].ptr = dev;
>> + return 0;
>> +}
>> +
>> +static struct msi_domain_ops its_pci_msi_ops = {
>> + .msi_prepare = its_msi_prepare,
>> +};
>> +
>> +static struct msi_domain_info its_pci_msi_domain_info = {
>> + .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
>> + MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX),
>> + .ops = &its_pci_msi_ops,
>> + .chip = &its_msi_irq_chip,
>> +};
>> +
>> +static int its_irq_gic_domain_alloc(struct irq_domain *domain,
>> + unsigned int virq,
>> + irq_hw_number_t hwirq)
>> +{
>> + struct of_phandle_args args;
>> +
>> + args.np = domain->parent->of_node;
>> + args.args_count = 3;
>> + args.args[0] = GIC_IRQ_TYPE_LPI;
>> + args.args[1] = hwirq;
>> + args.args[2] = IRQ_TYPE_EDGE_RISING;
>> +
>> + return irq_domain_alloc_irqs_parent(domain, virq, 1, &args);
>> +}
>> +
>> +static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
>> + unsigned int nr_irqs, void *args)
>> +{
>> + msi_alloc_info_t *info = args;
>> + struct its_device *its_dev = info->scratchpad[0].ptr;
>> + irq_hw_number_t hwirq;
>> + int err;
>> + int i;
>> +
>> + for (i = 0; i < nr_irqs; i++) {
>> + err = its_alloc_device_irq(its_dev, &hwirq);
>> + if (err)
>> + return err;
>> +
>> + err = its_irq_gic_domain_alloc(domain, virq + i, hwirq);
>> + if (err)
>> + return err;
>> +
>> + irq_domain_set_hwirq_and_chip(domain, virq + i,
>> + hwirq, &its_irq_chip, its_dev);
>> + dev_dbg(info->scratchpad[1].ptr, "ID:%d pID:%d vID:%d\n",
>> + (int)(hwirq - its_dev->lpi_base), (int)hwirq, virq + i);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static void its_irq_domain_free(struct irq_domain *domain, unsigned int virq,
>> + unsigned int nr_irqs)
>> +{
>> + struct irq_data *d = irq_domain_get_irq_data(domain, virq);
>> + struct its_device *its_dev = irq_data_get_irq_chip_data(d);
>> + int i;
>> +
>> + for (i = 0; i < nr_irqs; i++) {
>> + struct irq_data *data = irq_domain_get_irq_data(domain,
>> + virq + i);
>> + int event = its_get_event_id(data);
>> +
>> + /* Stop the delivery of interrupts */
>> + its_send_discard(its_dev, event);
>> +
>> + /* Mark interrupt index as unused */
>> + clear_bit(event, its_dev->lpi_map);
>> +
>> + /* Nuke the entry in the domain */
>> + irq_domain_reset_irq_data(d);
>
> I think you mean "data" here, instead of "d"?

Indeed, nice catch.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2014-12-10 03:07:52

by Abel Wu

Subject: Re: [PATCH v3 04/13] irqchip: GICv3: ITS command queue

On 2014/11/24 22:35, Marc Zyngier wrote:

[...]

> +static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
> + struct its_cmd_desc *desc)
> +{
> + unsigned long itt_addr;
> + u8 size = order_base_2(desc->its_mapd_cmd.dev->nr_ites);

If @nr_ites is 1, then @size becomes 0, and... (see below)

> +
> + itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
> + itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
> +
> + its_encode_cmd(cmd, GITS_CMD_MAPD);
> + its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
> + its_encode_size(cmd, size - 1);

Here (size - 1) becomes ~0, which exceeds the maximum number of
identifier bits supported.
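
The wrap-around can be reproduced in userspace with the sketch below. It rests on assumptions: order_base_2_sketch is a stand-in written for this example rather than the kernel macro, the truncation to a u8 mirrors the "u8 size" declaration in the quoted code, and the guarded variant is only one hypothetical way to avoid the problem, not a fix taken from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for order_base_2(): ceil(log2(n)) for n >= 1. */
static unsigned int order_base_2_sketch(unsigned long n)
{
	unsigned int order = 0;

	assert(n >= 1);
	while (order < 8 * sizeof(n) - 1 && (1UL << order) < n)
		order++;
	return order;
}

/* Value that ends up as "size - 1" once it is squeezed back into a u8. */
static uint8_t mapd_size_field(unsigned long nr_ites)
{
	uint8_t size = (uint8_t)order_base_2_sketch(nr_ites);

	return (uint8_t)(size - 1); /* nr_ites == 1 gives 0 - 1, i.e. 0xff */
}

/* Hypothetical guard: never describe fewer than 2 entries (1 identifier bit). */
static uint8_t mapd_size_field_guarded(unsigned long nr_ites)
{
	if (nr_ites < 2)
		nr_ites = 2;
	return mapd_size_field(nr_ites);
}
```

For nr_ites == 1 the unguarded helper returns 0xff, while any nr_ites of 2 or more gives the expected "bits minus one" value (for example 4 for 32 entries).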

Regards,
Abel

> + its_encode_itt(cmd, itt_addr);
> + its_encode_valid(cmd, desc->its_mapd_cmd.valid);
> +
> + its_fixup_cmd(cmd);
> +
> + return desc->its_mapd_cmd.dev->collection;
> +}
> +
> +static struct its_collection *its_build_mapc_cmd(struct its_cmd_block *cmd,
> + struct its_cmd_desc *desc)
> +{
> + its_encode_cmd(cmd, GITS_CMD_MAPC);
> + its_encode_collection(cmd, desc->its_mapc_cmd.col->col_id);
> + its_encode_target(cmd, desc->its_mapc_cmd.col->target_address);
> + its_encode_valid(cmd, desc->its_mapc_cmd.valid);
> +
> + its_fixup_cmd(cmd);
> +
> + return desc->its_mapc_cmd.col;
> +}
> +
> +static struct its_collection *its_build_mapvi_cmd(struct its_cmd_block *cmd,
> + struct its_cmd_desc *desc)
> +{
> + its_encode_cmd(cmd, GITS_CMD_MAPVI);
> + its_encode_devid(cmd, desc->its_mapvi_cmd.dev->device_id);
> + its_encode_event_id(cmd, desc->its_mapvi_cmd.event_id);
> + its_encode_phys_id(cmd, desc->its_mapvi_cmd.phys_id);
> + its_encode_collection(cmd, desc->its_mapvi_cmd.dev->collection->col_id);
> +
> + its_fixup_cmd(cmd);
> +
> + return desc->its_mapvi_cmd.dev->collection;
> +}
> +
> +static struct its_collection *its_build_movi_cmd(struct its_cmd_block *cmd,
> + struct its_cmd_desc *desc)
> +{
> + its_encode_cmd(cmd, GITS_CMD_MOVI);
> + its_encode_devid(cmd, desc->its_movi_cmd.dev->device_id);
> + its_encode_event_id(cmd, desc->its_movi_cmd.id);
> + its_encode_collection(cmd, desc->its_movi_cmd.col->col_id);
> +
> + its_fixup_cmd(cmd);
> +
> + return desc->its_movi_cmd.dev->collection;
> +}
> +
> +static struct its_collection *its_build_discard_cmd(struct its_cmd_block *cmd,
> + struct its_cmd_desc *desc)
> +{
> + its_encode_cmd(cmd, GITS_CMD_DISCARD);
> + its_encode_devid(cmd, desc->its_discard_cmd.dev->device_id);
> + its_encode_event_id(cmd, desc->its_discard_cmd.event_id);
> +
> + its_fixup_cmd(cmd);
> +
> + return desc->its_discard_cmd.dev->collection;
> +}
> +
> +static struct its_collection *its_build_inv_cmd(struct its_cmd_block *cmd,
> + struct its_cmd_desc *desc)
> +{
> + its_encode_cmd(cmd, GITS_CMD_INV);
> + its_encode_devid(cmd, desc->its_inv_cmd.dev->device_id);
> + its_encode_event_id(cmd, desc->its_inv_cmd.event_id);
> +
> + its_fixup_cmd(cmd);
> +
> + return desc->its_inv_cmd.dev->collection;
> +}
> +
> +static struct its_collection *its_build_invall_cmd(struct its_cmd_block *cmd,
> + struct its_cmd_desc *desc)
> +{
> + its_encode_cmd(cmd, GITS_CMD_INVALL);
> + its_encode_collection(cmd, desc->its_mapc_cmd.col->col_id);
> +
> + its_fixup_cmd(cmd);
> +
> + return NULL;
> +}
> +
> +static u64 its_cmd_ptr_to_offset(struct its_node *its,
> + struct its_cmd_block *ptr)
> +{
> + return (ptr - its->cmd_base) * sizeof(*ptr);
> +}
> +
> +static int its_queue_full(struct its_node *its)
> +{
> + int widx;
> + int ridx;
> +
> + widx = its->cmd_write - its->cmd_base;
> + ridx = readl_relaxed(its->base + GITS_CREADR) / sizeof(struct its_cmd_block);
> +
> + /* This is incredibly unlikely to happen, unless the ITS locks up. */
> + if (((widx + 1) % ITS_CMD_QUEUE_NR_ENTRIES) == ridx)
> + return 1;
> +
> + return 0;
> +}
> +
> +static struct its_cmd_block *its_allocate_entry(struct its_node *its)
> +{
> + struct its_cmd_block *cmd;
> + u32 count = 1000000; /* 1s! */
> +
> + while (its_queue_full(its)) {
> + count--;
> + if (!count) {
> + pr_err_ratelimited("ITS queue not draining\n");
> + return NULL;
> + }
> + cpu_relax();
> + udelay(1);
> + }
> +
> + cmd = its->cmd_write++;
> +
> + /* Handle queue wrapping */
> + if (its->cmd_write == (its->cmd_base + ITS_CMD_QUEUE_NR_ENTRIES))
> + its->cmd_write = its->cmd_base;
> +
> + return cmd;
> +}
> +
> +static struct its_cmd_block *its_post_commands(struct its_node *its)
> +{
> + u64 wr = its_cmd_ptr_to_offset(its, its->cmd_write);
> +
> + writel_relaxed(wr, its->base + GITS_CWRITER);
> +
> + return its->cmd_write;
> +}
> +
> +static void its_flush_cmd(struct its_node *its, struct its_cmd_block *cmd)
> +{
> + /*
> + * Make sure the commands written to memory are observable by
> + * the ITS.
> + */
> + if (its->flags & ITS_FLAGS_CMDQ_NEEDS_FLUSHING)
> + __flush_dcache_area(cmd, sizeof(*cmd));
> + else
> + dsb(ishst);
> +}
> +
> +static void its_wait_for_range_completion(struct its_node *its,
> + struct its_cmd_block *from,
> + struct its_cmd_block *to)
> +{
> + u64 rd_idx, from_idx, to_idx;
> + u32 count = 1000000; /* 1s! */
> +
> + from_idx = its_cmd_ptr_to_offset(its, from);
> + to_idx = its_cmd_ptr_to_offset(its, to);
> +
> + while (1) {
> + rd_idx = readl_relaxed(its->base + GITS_CREADR);
> + if (rd_idx >= to_idx || rd_idx < from_idx)
> + break;
> +
> + count--;
> + if (!count) {
> + pr_err_ratelimited("ITS queue timeout\n");
> + return;
> + }
> + cpu_relax();
> + udelay(1);
> + }
> +}
> +
> +static void its_send_single_command(struct its_node *its,
> + its_cmd_builder_t builder,
> + struct its_cmd_desc *desc)
> +{
> + struct its_cmd_block *cmd, *sync_cmd, *next_cmd;
> + struct its_collection *sync_col;
> +
> + raw_spin_lock(&its->lock);
> +
> + cmd = its_allocate_entry(its);
> + if (!cmd) { /* We're soooooo screewed... */
> + pr_err_ratelimited("ITS can't allocate, dropping command\n");
> + raw_spin_unlock(&its->lock);
> + return;
> + }
> + sync_col = builder(cmd, desc);
> + its_flush_cmd(its, cmd);
> +
> + if (sync_col) {
> + sync_cmd = its_allocate_entry(its);
> + if (!sync_cmd) {
> + pr_err_ratelimited("ITS can't SYNC, skipping\n");
> + goto post;
> + }
> + its_encode_cmd(sync_cmd, GITS_CMD_SYNC);
> + its_encode_target(sync_cmd, sync_col->target_address);
> + its_fixup_cmd(sync_cmd);
> + its_flush_cmd(its, sync_cmd);
> + }
> +
> +post:
> + next_cmd = its_post_commands(its);
> + raw_spin_unlock(&its->lock);
> +
> + its_wait_for_range_completion(its, cmd, next_cmd);
> +}
> +
> +static void its_send_inv(struct its_device *dev, u32 event_id)
> +{
> + struct its_cmd_desc desc;
> +
> + desc.its_inv_cmd.dev = dev;
> + desc.its_inv_cmd.event_id = event_id;
> +
> + its_send_single_command(dev->its, its_build_inv_cmd, &desc);
> +}
> +
> +static void its_send_mapd(struct its_device *dev, int valid)
> +{
> + struct its_cmd_desc desc;
> +
> + desc.its_mapd_cmd.dev = dev;
> + desc.its_mapd_cmd.valid = !!valid;
> +
> + its_send_single_command(dev->its, its_build_mapd_cmd, &desc);
> +}
> +
> +static void its_send_mapc(struct its_node *its, struct its_collection *col,
> + int valid)
> +{
> + struct its_cmd_desc desc;
> +
> + desc.its_mapc_cmd.col = col;
> + desc.its_mapc_cmd.valid = !!valid;
> +
> + its_send_single_command(its, its_build_mapc_cmd, &desc);
> +}
> +
> +static void its_send_mapvi(struct its_device *dev, u32 irq_id, u32 id)
> +{
> + struct its_cmd_desc desc;
> +
> + desc.its_mapvi_cmd.dev = dev;
> + desc.its_mapvi_cmd.phys_id = irq_id;
> + desc.its_mapvi_cmd.event_id = id;
> +
> + its_send_single_command(dev->its, its_build_mapvi_cmd, &desc);
> +}
> +
> +static void its_send_movi(struct its_device *dev,
> + struct its_collection *col, u32 id)
> +{
> + struct its_cmd_desc desc;
> +
> + desc.its_movi_cmd.dev = dev;
> + desc.its_movi_cmd.col = col;
> + desc.its_movi_cmd.id = id;
> +
> + its_send_single_command(dev->its, its_build_movi_cmd, &desc);
> +}
> +
> +static void its_send_discard(struct its_device *dev, u32 id)
> +{
> + struct its_cmd_desc desc;
> +
> + desc.its_discard_cmd.dev = dev;
> + desc.its_discard_cmd.event_id = id;
> +
> + its_send_single_command(dev->its, its_build_discard_cmd, &desc);
> +}
> +
> +static void its_send_invall(struct its_node *its, struct its_collection *col)
> +{
> + struct its_cmd_desc desc;
> +
> + desc.its_invall_cmd.col = col;
> +
> + its_send_single_command(its, its_build_invall_cmd, &desc);
> +}
> diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
> index 040615a..21c9d70 100644
> --- a/include/linux/irqchip/arm-gic-v3.h
> +++ b/include/linux/irqchip/arm-gic-v3.h
> @@ -80,9 +80,27 @@
> #define GICR_MOVALLR 0x0110
> #define GICR_PIDR2 GICD_PIDR2
>
> +#define GICR_CTLR_ENABLE_LPIS (1UL << 0)
> +
> +#define GICR_TYPER_CPU_NUMBER(r) (((r) >> 8) & 0xffff)
> +
> #define GICR_WAKER_ProcessorSleep (1U << 1)
> #define GICR_WAKER_ChildrenAsleep (1U << 2)
>
> +#define GICR_PROPBASER_NonShareable (0U << 10)
> +#define GICR_PROPBASER_InnerShareable (1U << 10)
> +#define GICR_PROPBASER_OuterShareable (2U << 10)
> +#define GICR_PROPBASER_SHAREABILITY_MASK (3UL << 10)
> +#define GICR_PROPBASER_nCnB (0U << 7)
> +#define GICR_PROPBASER_nC (1U << 7)
> +#define GICR_PROPBASER_RaWt (2U << 7)
> +#define GICR_PROPBASER_RaWb (3U << 7)
> +#define GICR_PROPBASER_WaWt (4U << 7)
> +#define GICR_PROPBASER_WaWb (5U << 7)
> +#define GICR_PROPBASER_RaWaWt (6U << 7)
> +#define GICR_PROPBASER_RaWaWb (7U << 7)
> +#define GICR_PROPBASER_IDBITS_MASK (0x1f)
> +
> /*
> * Re-Distributor registers, offsets from SGI_base
> */
> @@ -95,9 +113,93 @@
> #define GICR_IPRIORITYR0 GICD_IPRIORITYR
> #define GICR_ICFGR0 GICD_ICFGR
>
> +#define GICR_TYPER_PLPIS (1U << 0)
> #define GICR_TYPER_VLPIS (1U << 1)
> #define GICR_TYPER_LAST (1U << 4)
>
> +#define LPI_PROP_GROUP1 (1 << 1)
> +#define LPI_PROP_ENABLED (1 << 0)
> +
> +/*
> + * ITS registers, offsets from ITS_base
> + */
> +#define GITS_CTLR 0x0000
> +#define GITS_IIDR 0x0004
> +#define GITS_TYPER 0x0008
> +#define GITS_CBASER 0x0080
> +#define GITS_CWRITER 0x0088
> +#define GITS_CREADR 0x0090
> +#define GITS_BASER 0x0100
> +#define GITS_PIDR2 GICR_PIDR2
> +
> +#define GITS_TRANSLATER 0x10040
> +
> +#define GITS_TYPER_PTA (1UL << 19)
> +
> +#define GITS_CBASER_VALID (1UL << 63)
> +#define GITS_CBASER_nCnB (0UL << 59)
> +#define GITS_CBASER_nC (1UL << 59)
> +#define GITS_CBASER_RaWt (2UL << 59)
> +#define GITS_CBASER_RaWb (3UL << 59)
> +#define GITS_CBASER_WaWt (4UL << 59)
> +#define GITS_CBASER_WaWb (5UL << 59)
> +#define GITS_CBASER_RaWaWt (6UL << 59)
> +#define GITS_CBASER_RaWaWb (7UL << 59)
> +#define GITS_CBASER_NonShareable (0UL << 10)
> +#define GITS_CBASER_InnerShareable (1UL << 10)
> +#define GITS_CBASER_OuterShareable (2UL << 10)
> +#define GITS_CBASER_SHAREABILITY_MASK (3UL << 10)
> +
> +#define GITS_BASER_NR_REGS 8
> +
> +#define GITS_BASER_VALID (1UL << 63)
> +#define GITS_BASER_nCnB (0UL << 59)
> +#define GITS_BASER_nC (1UL << 59)
> +#define GITS_BASER_RaWt (2UL << 59)
> +#define GITS_BASER_RaWb (3UL << 59)
> +#define GITS_BASER_WaWt (4UL << 59)
> +#define GITS_BASER_WaWb (5UL << 59)
> +#define GITS_BASER_RaWaWt (6UL << 59)
> +#define GITS_BASER_RaWaWb (7UL << 59)
> +#define GITS_BASER_TYPE_SHIFT (56)
> +#define GITS_BASER_TYPE(r) (((r) >> GITS_BASER_TYPE_SHIFT) & 7)
> +#define GITS_BASER_ENTRY_SIZE_SHIFT (48)
> +#define GITS_BASER_ENTRY_SIZE(r) ((((r) >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0xff) + 1)
> +#define GITS_BASER_NonShareable (0UL << 10)
> +#define GITS_BASER_InnerShareable (1UL << 10)
> +#define GITS_BASER_OuterShareable (2UL << 10)
> +#define GITS_BASER_SHAREABILITY_SHIFT (10)
> +#define GITS_BASER_SHAREABILITY_MASK (3UL << GITS_BASER_SHAREABILITY_SHIFT)
> +#define GITS_BASER_PAGE_SIZE_SHIFT (8)
> +#define GITS_BASER_PAGE_SIZE_4K (0UL << GITS_BASER_PAGE_SIZE_SHIFT)
> +#define GITS_BASER_PAGE_SIZE_16K (1UL << GITS_BASER_PAGE_SIZE_SHIFT)
> +#define GITS_BASER_PAGE_SIZE_64K (2UL << GITS_BASER_PAGE_SIZE_SHIFT)
> +#define GITS_BASER_PAGE_SIZE_MASK (3UL << GITS_BASER_PAGE_SIZE_SHIFT)
> +
> +#define GITS_BASER_TYPE_NONE 0
> +#define GITS_BASER_TYPE_DEVICE 1
> +#define GITS_BASER_TYPE_VCPU 2
> +#define GITS_BASER_TYPE_CPU 3
> +#define GITS_BASER_TYPE_COLLECTION 4
> +#define GITS_BASER_TYPE_RESERVED5 5
> +#define GITS_BASER_TYPE_RESERVED6 6
> +#define GITS_BASER_TYPE_RESERVED7 7
> +
> +/*
> + * ITS commands
> + */
> +#define GITS_CMD_MAPD 0x08
> +#define GITS_CMD_MAPC 0x09
> +#define GITS_CMD_MAPVI 0x0a
> +#define GITS_CMD_MOVI 0x01
> +#define GITS_CMD_DISCARD 0x0f
> +#define GITS_CMD_INV 0x0c
> +#define GITS_CMD_MOVALL 0x0e
> +#define GITS_CMD_INVALL 0x0d
> +#define GITS_CMD_INT 0x03
> +#define GITS_CMD_CLEAR 0x04
> +#define GITS_CMD_SYNC 0x05
> +
> /*
> * CPU interface registers
> */


2014-12-10 11:20:41

by Marc Zyngier

Subject: Re: [PATCH v3 04/13] irqchip: GICv3: ITS command queue

On 10/12/14 03:03, Yun Wu (Abel) wrote:
> On 2014/11/24 22:35, Marc Zyngier wrote:
>
> [...]
>
>> +static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
>> + struct its_cmd_desc *desc)
>> +{
>> + unsigned long itt_addr;
>> + u8 size = order_base_2(desc->its_mapd_cmd.dev->nr_ites);
>
> If @nr_ites is 1, then @size becomes 0, and... (see below)
>
>> +
>> + itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
>> + itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
>> +
>> + its_encode_cmd(cmd, GITS_CMD_MAPD);
>> + its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
>> + its_encode_size(cmd, size - 1);
>
> here (size - 1) becomes the value of ~0, which will exceed the maximum
> supported bits of identifier.

Well, the problem is that nr_ites should never be 1, as it effectively
means "don't use any bit to index the ITE". And it also means we cannot
have an ITT that's not a strict power of two.
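
One way to enforce both constraints at the point where the ITT is sized is
sketched below. This is only an illustration and not necessarily the fix that
follows: its_nr_ites_sketch() and roundup_pow2_sketch() are hypothetical
user-space helpers, and the idea is simply to round the requested vector count
up to a power of two and never go below 2 entries.

```c
#include <assert.h>
#include <limits.h>

/* Round n up to the next power of two (1 <= n <= 2^(bits-1)). */
static unsigned long roundup_pow2_sketch(unsigned long n)
{
	unsigned long p = 1;

	assert(n > 0);
	assert(n <= (ULONG_MAX >> 1) + 1);	/* avoid shifting p to 0 */
	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper: number of ITT entries to allocate for nvecs
 * interrupts. Always a power of two and never less than 2, so that
 * order_base_2(nr_ites) - 1 cannot underflow.
 */
static unsigned long its_nr_ites_sketch(unsigned long nvecs)
{
	unsigned long n = roundup_pow2_sketch(nvecs);

	return n < 2 ? 2 : n;
}
```

With this clamp a single-vector device still gets a two-entry ITT, so the
encoded size is 0 (one bit of event ID) rather than a wrapped value.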

So while this is indeed a bug, the root of the problem is elsewhere.

I'll cook a fix, thanks for the report.

M.
--
Jazz is not dead. It just smells funny...