2020-11-25 11:20:09

by Laurent Vivier

Subject: [PATCH v2 0/2] powerpc/pseries: fix MSI/X IRQ affinity on pseries

With virtio, in the multiqueue case, each queue IRQ is normally
bound to a different CPU using the affinity mask.

This works fine on x86_64 but is totally ignored on pseries.

This is not obvious at first look because irqbalance is doing
some balancing to improve that.

It appears that the "managed" flag set in the MSI entry
is never copied to the system IRQ entry.

This series passes the affinity mask from rtas_setup_msi_irqs()
to irq_domain_alloc_descs() by adding an affinity parameter to
irq_create_mapping().

The first patch adds the parameter (no functional change), the
second patch passes the actual affinity mask to irq_create_mapping()
in rtas_setup_msi_irqs().
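
To illustrate the shape of the change, here is a minimal userspace sketch of the wrapper approach. It is not the kernel code: every `model_` name, the 32-bit mask and the `is_managed` field are simplified stand-ins invented for this example. It only shows that an entry point taking an optional affinity hint can coexist with the old one, which forwards NULL so that existing callers behave as before.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
#define MODEL_ALL_CPUS 0xffffffffu	/* 32 CPUs, one bit each */

struct model_affinity {
	uint32_t mask;		/* CPUs this vector should be bound to */
	int is_managed;		/* models the "managed" flag of the MSI entry */
};

struct model_irq_desc {
	unsigned int hwirq;
	uint32_t mask;
	int is_managed;
};

/* Models descriptor allocation: a NULL hint means "any CPU, not managed". */
static void model_alloc_desc(struct model_irq_desc *desc, unsigned int hwirq,
			     const struct model_affinity *affinity)
{
	assert(desc != NULL);
	desc->hwirq = hwirq;
	desc->mask = affinity ? affinity->mask : MODEL_ALL_CPUS;
	desc->is_managed = affinity ? affinity->is_managed : 0;
}

/* New entry point: identical to the old one plus an optional hint. */
static void model_create_mapping_affinity(struct model_irq_desc *desc,
					  unsigned int hwirq,
					  const struct model_affinity *affinity)
{
	model_alloc_desc(desc, hwirq, affinity);
}

/* Old entry point kept as a thin wrapper, so callers need no change. */
static void model_create_mapping(struct model_irq_desc *desc,
				 unsigned int hwirq)
{
	model_create_mapping_affinity(desc, hwirq, NULL);
}
```

In this model, a caller that holds a per-vector hint passes it to model_create_mapping_affinity(), while every other caller keeps using model_create_mapping() and still gets the all-CPUs default.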



For instance, with a 32-CPU VM and a 32-queue virtio-scsi interface:

... -smp 32 -device virtio-scsi-pci,id=virtio_scsi_pci0,num_queues=32

for IRQ in $(grep virtio2-request /proc/interrupts | cut -d: -f1); do
    for file in /proc/irq/$IRQ/ ; do
        echo -n "IRQ: $(basename $file) CPU: " ; cat $file/smp_affinity_list
    done
done



Without the patch (and without irqbalance):

IRQ: 268 CPU: 0-31
IRQ: 269 CPU: 0-31
IRQ: 270 CPU: 0-31
IRQ: 271 CPU: 0-31
IRQ: 272 CPU: 0-31
IRQ: 273 CPU: 0-31
IRQ: 274 CPU: 0-31
IRQ: 275 CPU: 0-31
IRQ: 276 CPU: 0-31
IRQ: 277 CPU: 0-31
IRQ: 278 CPU: 0-31
IRQ: 279 CPU: 0-31
IRQ: 280 CPU: 0-31
IRQ: 281 CPU: 0-31
IRQ: 282 CPU: 0-31
IRQ: 283 CPU: 0-31
IRQ: 284 CPU: 0-31
IRQ: 285 CPU: 0-31
IRQ: 286 CPU: 0-31
IRQ: 287 CPU: 0-31
IRQ: 288 CPU: 0-31
IRQ: 289 CPU: 0-31
IRQ: 290 CPU: 0-31
IRQ: 291 CPU: 0-31
IRQ: 292 CPU: 0-31
IRQ: 293 CPU: 0-31
IRQ: 294 CPU: 0-31
IRQ: 295 CPU: 0-31
IRQ: 296 CPU: 0-31
IRQ: 297 CPU: 0-31
IRQ: 298 CPU: 0-31
IRQ: 299 CPU: 0-31



With the patch:

IRQ: 265 CPU: 0
IRQ: 266 CPU: 1
IRQ: 267 CPU: 2
IRQ: 268 CPU: 3
IRQ: 269 CPU: 4
IRQ: 270 CPU: 5
IRQ: 271 CPU: 6
IRQ: 272 CPU: 7
IRQ: 273 CPU: 8
IRQ: 274 CPU: 9
IRQ: 275 CPU: 10
IRQ: 276 CPU: 11
IRQ: 277 CPU: 12
IRQ: 278 CPU: 13
IRQ: 279 CPU: 14
IRQ: 280 CPU: 15
IRQ: 281 CPU: 16
IRQ: 282 CPU: 17
IRQ: 283 CPU: 18
IRQ: 284 CPU: 19
IRQ: 285 CPU: 20
IRQ: 286 CPU: 21
IRQ: 287 CPU: 22
IRQ: 288 CPU: 23
IRQ: 289 CPU: 24
IRQ: 290 CPU: 25
IRQ: 291 CPU: 26
IRQ: 292 CPU: 27
IRQ: 293 CPU: 28
IRQ: 294 CPU: 29
IRQ: 295 CPU: 30
IRQ: 299 CPU: 31

This matches what we have on an x86_64 system.
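
For anyone who wants to check such a listing mechanically, the following is a small sketch, not part of the series, that counts the CPUs named by a string in the format printed by smp_affinity_list ("0-31", "5", "0,2,4-7"). The helper names cpulist_weight() and irq_is_pinned() are made up for this example, and the parser deliberately ignores less common syntax such as strides and does not detect overlapping ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Count the CPUs named by a cpulist string such as "0-31" or "0,2,4-7".
 * Parsing stops at the end of the string or at a newline.
 * Returns -1 on malformed input. Overlapping ranges are counted twice.
 */
static int cpulist_weight(const char *list)
{
	const char *p = list;
	int count = 0;

	assert(list != NULL);

	while (*p != '\0' && *p != '\n') {
		char *end;
		long first = strtol(p, &end, 10);
		long last = first;

		if (end == p || first < 0)
			return -1;
		p = end;

		if (*p == '-') {
			/* A range "first-last" */
			p++;
			last = strtol(p, &end, 10);
			if (end == p || last < first)
				return -1;
			p = end;
		}
		count += (int)(last - first + 1);

		if (*p == ',')
			p++;
		else if (*p != '\0' && *p != '\n')
			return -1;
	}
	return count;
}

/* An IRQ is treated as pinned when its list names exactly one CPU. */
static int irq_is_pinned(const char *list)
{
	return cpulist_weight(list) == 1;
}
```

Applied to the two listings, a line such as "0-31" is reported as not pinned, while a line such as "5" is reported as pinned.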



v2: add a wrapper around the original irq_create_mapping() with the
    affinity parameter. Update comments

Laurent Vivier (2):
  genirq: add an irq_create_mapping_affinity() function
  powerpc/pseries: pass MSI affinity to irq_create_mapping()

 arch/powerpc/platforms/pseries/msi.c |  3 ++-
 include/linux/irqdomain.h            | 12 ++++++++++--
 kernel/irq/irqdomain.c               | 13 ++++++++-----
 3 files changed, 20 insertions(+), 8 deletions(-)

--
2.28.0





2020-11-25 11:21:36

by Laurent Vivier

Subject: [PATCH v2 2/2] powerpc/pseries: pass MSI affinity to irq_create_mapping()

With virtio multiqueue, normally each queue IRQ is mapped to a CPU.

But since commit 0d9f0a52c8b9f ("virtio_scsi: use virtio IRQ affinity")
this is broken on pseries.

The affinity is correctly computed in msi_desc but this is not applied
to the system IRQs.

It appears the affinity is correctly passed to rtas_setup_msi_irqs() but
lost at this point and never passed to irq_domain_alloc_descs()
(see commit 06ee6d571f0e ("genirq: Add affinity hint to irq allocation"))
because irq_create_mapping() doesn't take an affinity parameter.

As the previous patch has added the affinity parameter to
irq_create_mapping() we can forward the affinity from rtas_setup_msi_irqs()
to irq_domain_alloc_descs().

With this change, the virtqueues are correctly dispatched between the CPUs
on pseries.

Signed-off-by: Laurent Vivier <[email protected]>
---
arch/powerpc/platforms/pseries/msi.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index 133f6adcb39c..b3ac2455faad 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -458,7 +458,8 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 			return hwirq;
 		}

-		virq = irq_create_mapping(NULL, hwirq);
+		virq = irq_create_mapping_affinity(NULL, hwirq,
+						   entry->affinity);

 		if (!virq) {
 			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
--
2.28.0

2020-11-25 12:55:54

by Greg Kurz

Subject: Re: [PATCH v2 2/2] powerpc/pseries: pass MSI affinity to irq_create_mapping()

On Wed, 25 Nov 2020 12:16:57 +0100
Laurent Vivier <[email protected]> wrote:

> With virtio multiqueue, normally each queue IRQ is mapped to a CPU.
>
> But since commit 0d9f0a52c8b9f ("virtio_scsi: use virtio IRQ affinity")
> this is broken on pseries.
>
> The affinity is correctly computed in msi_desc but this is not applied
> to the system IRQs.
>
> It appears the affinity is correctly passed to rtas_setup_msi_irqs() but
> lost at this point and never passed to irq_domain_alloc_descs()
> (see commit 06ee6d571f0e ("genirq: Add affinity hint to irq allocation"))
> because irq_create_mapping() doesn't take an affinity parameter.
>
> As the previous patch has added the affinity parameter to
> irq_create_mapping() we can forward the affinity from rtas_setup_msi_irqs()
> to irq_domain_alloc_descs().
>
> With this change, the virtqueues are correctly dispatched between the CPUs
> on pseries.
>

Since it is public, maybe add:

BugId: https://bugzilla.redhat.com/show_bug.cgi?id=1702939

?

> Signed-off-by: Laurent Vivier <[email protected]>
> ---

Anyway,

Reviewed-by: Greg Kurz <[email protected]>

> arch/powerpc/platforms/pseries/msi.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
> index 133f6adcb39c..b3ac2455faad 100644
> --- a/arch/powerpc/platforms/pseries/msi.c
> +++ b/arch/powerpc/platforms/pseries/msi.c
> @@ -458,7 +458,8 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
>  			return hwirq;
>  		}
>
> -		virq = irq_create_mapping(NULL, hwirq);
> +		virq = irq_create_mapping_affinity(NULL, hwirq,
> +						   entry->affinity);
>
>  		if (!virq) {
>  			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);