2016-12-01 07:46:42

by majun (Euler7)

[permalink] [raw]
Subject: [RFC PATCH 0/3] Add a new flag for ITS device to control indirect route

From: MaJun <[email protected]>

For current ITS driver, two level table (indirect route) is enabled when the memory used
for LPI route table over the limit(64KB * 2) size. But this function impact the
performance of LPI interrupt actually because need more time to look up the table.

Although this function can save the memory needed, we'd better let the user
to decide enable or disable this function.

MaJun (3):
Binding: Add a new property string in ITS node to control the two-level route function
irqchip/gicv3-its:irqchip/gicv3-its: add a new flag to control indirect route in DT mode
irqchip/gicv3-its:irqchip/gicv3-its: Add a new flag to control indirect route function in ACPI mode.

.../bindings/interrupt-controller/arm,gic-v3.txt | 3 +++
drivers/irqchip/irq-gic-v3-its.c | 19 ++++++++++++++-----
include/acpi/actbl1.h | 3 ++-
3 files changed, 19 insertions(+), 6 deletions(-)

--
1.7.12.4



2016-12-01 07:46:36

by majun (Euler7)

[permalink] [raw]
Subject: [RFC PATCH 1/3]Binding: Add a new property string in ITS node to control the two-level route function

From: MaJun <[email protected]>

Add the two-level-route property in ITS node.
When this property string defined, two-level route(indirect) function
will be enabled in ITS driver, otherwise disable it.

Signed-off-by: MaJun <[email protected]>
---
Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt | 3 +++
1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
index 4c29cda..e9f4a9c 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
+++ b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
@@ -74,6 +74,8 @@ These nodes must have the following properties:
which will generate the MSI.
- reg: Specifies the base physical address and size of the ITS
registers.
+- two-level-route: This is an optional property which means enable the two level
+ route property when look up route table.

The main GIC node must contain the appropriate #address-cells,
#size-cells and ranges properties for the reg property of all ITS
@@ -97,6 +99,7 @@ Examples:

gic-its@2c200000 {
compatible = "arm,gic-v3-its";
+ two-level-route;
msi-controller;
#msi-cells = <1>;
reg = <0x0 0x2c200000 0 0x200000>;
--
1.7.12.4


2016-12-01 07:46:39

by majun (Euler7)

[permalink] [raw]
Subject: [RFC PATCH 2/3] irqchip/gicv3-its: add a new flag to control indirect route in DT mode

From: MaJun <[email protected]>

Add a new flag for ITS node in DT mode so we can disable/enable the
indirect route function.

Signed-off-by: MaJun <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index d278425..ee54133 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -46,6 +46,7 @@
#define ITS_FLAGS_CMDQ_NEEDS_FLUSHING (1ULL << 0)
#define ITS_FLAGS_WORKAROUND_CAVIUM_22375 (1ULL << 1)
#define ITS_FLAGS_WORKAROUND_CAVIUM_23144 (1ULL << 2)
+#define ITS_FLAGS_INDIRECT_ROUTE (1ULL << 3)

#define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING (1 << 0)

@@ -967,7 +968,7 @@ static bool its_parse_baser_device(struct its_node *its, struct its_baser *baser
bool indirect = false;

/* No need to enable Indirection if memory requirement < (psz*2)bytes */
- if ((esz << ids) > (psz * 2)) {
+ if ((its->flags & ITS_FLAGS_INDIRECT_ROUTE) && ((esz << ids) > (psz * 2))) {
/*
* Find out whether hw supports a single or two-level table by
* table by reading bit at offset '62' after writing '1' to it.
@@ -1673,8 +1674,8 @@ static int its_init_domain(struct fwnode_handle *handle, struct its_node *its)
return 0;
}

-static int __init its_probe_one(struct resource *res,
- struct fwnode_handle *handle, int numa_node)
+static int __init its_probe_one(struct resource *res, struct fwnode_handle *handle,
+ int numa_node, u8 flags)
{
struct its_node *its;
void __iomem *its_base;
@@ -1716,6 +1717,7 @@ static int __init its_probe_one(struct resource *res,
its->phys_base = res->start;
its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
its->numa_node = numa_node;
+ its->flags |= flags;

its->cmd_base = kzalloc(ITS_CMD_QUEUE_SZ, GFP_KERNEL);
if (!its->cmd_base) {
@@ -1812,6 +1814,7 @@ static int __init its_of_probe(struct device_node *node)
{
struct device_node *np;
struct resource res;
+ u8 flags = 0;

for (np = of_find_matching_node(node, its_device_id); np;
np = of_find_matching_node(np, its_device_id)) {
@@ -1826,7 +1829,10 @@ static int __init its_of_probe(struct device_node *node)
continue;
}

- its_probe_one(&res, &np->fwnode, of_node_to_nid(np));
+ if (of_property_read_bool(np, "two-level-route"))
+ flags |= ITS_FLAGS_INDIRECT_ROUTE;
+
+ its_probe_one(&res, &np->fwnode, of_node_to_nid(np), flags);
}
return 0;
}
@@ -1863,7 +1869,7 @@ static int __init gic_acpi_parse_madt_its(struct acpi_subtable_header *header,
goto dom_err;
}

- err = its_probe_one(&res, dom_handle, NUMA_NO_NODE);
+ err = its_probe_one(&res, dom_handle, NUMA_NO_NODE, 0);
if (!err)
return 0;

--
1.7.12.4


2016-12-01 07:46:58

by majun (Euler7)

[permalink] [raw]
Subject: [RFC PATCH 3/3]irqchip/gicv3-its: Add a new flag to control indirect route in ACPI mode.

From: MaJun <[email protected]>

Add a new flag to control indirect route function for ACPI mode.
To carry the user defined flags information, we used the reserved byte
in ITS MADT table

Signed-off-by: MaJun <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 5 ++++-
include/acpi/actbl1.h | 3 ++-
2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index ee54133..4420283 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1848,6 +1848,7 @@ static int __init gic_acpi_parse_madt_its(struct acpi_subtable_header *header,
struct fwnode_handle *dom_handle;
struct resource res;
int err;
+ u8 flags = 0;

its_entry = (struct acpi_madt_generic_translator *)header;
memset(&res, 0, sizeof(res));
@@ -1855,6 +1856,8 @@ static int __init gic_acpi_parse_madt_its(struct acpi_subtable_header *header,
res.end = its_entry->base_address + ACPI_GICV3_ITS_MEM_SIZE - 1;
res.flags = IORESOURCE_MEM;

+ flags = its_entry->flags;
+
dom_handle = irq_domain_alloc_fwnode((void *)its_entry->base_address);
if (!dom_handle) {
pr_err("ITS@%pa: Unable to allocate GICv3 ITS domain token\n",
@@ -1869,7 +1872,7 @@ static int __init gic_acpi_parse_madt_its(struct acpi_subtable_header *header,
goto dom_err;
}

- err = its_probe_one(&res, dom_handle, NUMA_NO_NODE, 0);
+ err = its_probe_one(&res, dom_handle, NUMA_NO_NODE, flags);
if (!err)
return 0;

diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h
index 796d6ba..42a08ae 100644
--- a/include/acpi/actbl1.h
+++ b/include/acpi/actbl1.h
@@ -930,7 +930,8 @@ struct acpi_madt_generic_translator {
u16 reserved; /* reserved - must be zero */
u32 translation_id;
u64 base_address;
- u32 reserved2;
+ u8 flags;
+ u8 reserved2[3];
};

/*
--
1.7.12.4


2016-12-01 09:07:22

by Marc Zyngier

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] Add a new flag for ITS device to control indirect route

On 01/12/16 07:45, Majun wrote:
> From: MaJun <[email protected]>
>
> For current ITS driver, two level table (indirect route) is enabled when the memory used
> for LPI route table over the limit(64KB * 2) size. But this function impact the
> performance of LPI interrupt actually because need more time to look up the table.

Are you implying that your ITS doesn't have a cache to lookup the most
active devices, hence performing a full lookup on each interrupt?

Anyway, doing this as a DT quirk doesn't feel right. Please use the ITS
quirk infrastructure.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2016-12-02 09:30:24

by majun (Euler7)

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] Add a new flag for ITS device to control indirect route



在 2016/12/1 17:07, Marc Zyngier 写道:
> On 01/12/16 07:45, Majun wrote:
>> From: MaJun <[email protected]>
>>
>> For current ITS driver, two level table (indirect route) is enabled when the memory used
>> for LPI route table over the limit(64KB * 2) size. But this function impact the
>> performance of LPI interrupt actually because need more time to look up the table.
>
> Are you implying that your ITS doesn't have a cache to lookup the most
> active devices, hence performing a full lookup on each interrupt?

Our ITS chip has the cache with depth 64. But this seems not enough for some
scenario,espeically on virtulization platform.
>
> Anyway, doing this as a DT quirk doesn't feel right. Please use the ITS
> quirk infrastructure.

If there is no other better solutions, I will do this.

Thanks!
Majun



>
> Thanks,
>
> M.
>

2016-12-02 09:35:17

by Marc Zyngier

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] Add a new flag for ITS device to control indirect route

On 02/12/16 09:29, majun (Euler7) wrote:
>
>
> 在 2016/12/1 17:07, Marc Zyngier 写道:
>> On 01/12/16 07:45, Majun wrote:
>>> From: MaJun <[email protected]>
>>>
>>> For current ITS driver, two level table (indirect route) is enabled when the memory used
>>> for LPI route table over the limit(64KB * 2) size. But this function impact the
>>> performance of LPI interrupt actually because need more time to look up the table.
>>
>> Are you implying that your ITS doesn't have a cache to lookup the most
>> active devices, hence performing a full lookup on each interrupt?
>
> Our ITS chip has the cache with depth 64. But this seems not enough for some
> scenario,espeically on virtulization platform.

Then I don't see how switching to to flat tables is going to improve
things. Can you share actual performance numbers?

>> Anyway, doing this as a DT quirk doesn't feel right. Please use the ITS
>> quirk infrastructure.
>
> If there is no other better solutions, I will do this.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2016-12-05 03:11:55

by majun (Euler7)

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] Add a new flag for ITS device to control indirect route

Hi Marc:

在 2016/12/2 17:35, Marc Zyngier 写道:
> On 02/12/16 09:29, majun (Euler7) wrote:
>>
>>
>> 在 2016/12/1 17:07, Marc Zyngier 写道:
>>> On 01/12/16 07:45, Majun wrote:
>>>> From: MaJun <[email protected]>
>>>>
>>>> For current ITS driver, two level table (indirect route) is enabled when the memory used
>>>> for LPI route table over the limit(64KB * 2) size. But this function impact the
>>>> performance of LPI interrupt actually because need more time to look up the table.
>>>
>>> Are you implying that your ITS doesn't have a cache to lookup the most
>>> active devices, hence performing a full lookup on each interrupt?
>>
>> Our ITS chip has the cache with depth 64. But this seems not enough for some
>> scenario,espeically on virtulization platform.
>
> Then I don't see how switching to to flat tables is going to improve
> things. Can you share actual performance numbers?
>
Sorry, I run this code on EMU and have no actual performance numbers now.

Suppose there are 66 devices in system.
As far as our chip concerned, there are always 2 devices can't benefit from
cache fully when they report the interrupt.

If i'm wrong, please correct me.

Thanks
Majun

>>> Anyway, doing this as a DT quirk doesn't feel right. Please use the ITS
>>> quirk infrastructure.
>>
>> If there is no other better solutions, I will do this.
>
> Thanks,
>
> M.
>

2016-12-05 09:01:07

by Marc Zyngier

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] Add a new flag for ITS device to control indirect route

On 05/12/16 03:11, majun (Euler7) wrote:
> Hi Marc:
>
> 在 2016/12/2 17:35, Marc Zyngier 写道:
>> On 02/12/16 09:29, majun (Euler7) wrote:
>>>
>>>
>>> 在 2016/12/1 17:07, Marc Zyngier 写道:
>>>> On 01/12/16 07:45, Majun wrote:
>>>>> From: MaJun <[email protected]>
>>>>>
>>>>> For current ITS driver, two level table (indirect route) is enabled when the memory used
>>>>> for LPI route table over the limit(64KB * 2) size. But this function impact the
>>>>> performance of LPI interrupt actually because need more time to look up the table.
>>>>
>>>> Are you implying that your ITS doesn't have a cache to lookup the most
>>>> active devices, hence performing a full lookup on each interrupt?
>>>
>>> Our ITS chip has the cache with depth 64. But this seems not enough for some
>>> scenario,espeically on virtulization platform.
>>
>> Then I don't see how switching to to flat tables is going to improve
>> things. Can you share actual performance numbers?
>>
> Sorry, I run this code on EMU and have no actual performance numbers now.

So how can you make a decision on what is obviously an optimization for
a given use case?

> Suppose there are 66 devices in system.
> As far as our chip concerned, there are always 2 devices can't benefit from
> cache fully when they report the interrupt.
>
> If i'm wrong, please correct me.

Congratulations, you've just discovered one the limitations of *any*
cache. If your miss rate is too high, then your cache is too small (or
your replacement policy is suboptimal). Switching to flat tables is
going to slightly reduce the miss latency (one read less), but is not
going to improve the miss rate.

I'd suggest you talk to your HW people so that they give you either a
bigger cache or a better replacement policy. Or even put fewer devices
in front of your ITS so that you won't miss in the cache, assuming that
your interrupt latency is so critical that you can't miss once in a
while (which I very seriously doubt).

Thanks,

M.
--
Jazz is not dead. It just smells funny...