2023-09-25 02:52:17

by Wei Gong

Subject: [PATCH v2] genirq: avoid long loops in handle_edge_irq

When a large number of interrupts occur on the tx queue of a network
card (irq smp_affinity=1), changing the CPU affinity of that queue
(echo 2 > /proc/irq/xx/smp_affinity) causes handle_edge_irq() to spin
for a long time in its do {} while () loop.

A new IRQ CPU affinity only takes effect when the next interrupt
arrives, so that interrupt is still delivered to CPU 0. Once the new
affinity has been activated on CPU 0, subsequent interrupts are
delivered to CPU 1. While CPU 0 is still inside the loop, each interrupt
arriving on CPU 1 only sets IRQS_PENDING and returns, so CPU 0 keeps
iterating.

       cpu 0                              cpu 1
 - handle_edge_irq
   - apic_ack_irq
     - irq_do_set_affinity
                                      - handle_edge_irq
   - do {
     - handle_irq_event
       - istate &= ~IRQS_PENDING
       - IRQD_IRQ_INPROGRESS
       - spin_unlock()
                                        - spin_lock()
                                        - istate |= IRQS_PENDING
       - handle_irq_event_percpu        - mask_ack_irq()
                                        - spin_unlock()
       - spin_unlock

     } while(IRQS_PENDING &&
             !irq_disable)

Therefore, when deciding whether to continue looping, also check
whether the current CPU is in the affinity mask of the interrupt.

Signed-off-by: Wei Gong <[email protected]>
---
kernel/irq/chip.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index dc94e0bf2c94..6da455e1a692 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
 		handle_irq_event(desc);

 	} while ((desc->istate & IRQS_PENDING) &&
-		 !irqd_irq_disabled(&desc->irq_data));
+		 !irqd_irq_disabled(&desc->irq_data) &&
+		 cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));

 out_unlock:
 	raw_spin_unlock(&desc->lock);
--
2.32.1 (Apple Git-133)
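
The effect of the extra loop condition can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `mask_test_cpu()` and `handled_events()` are hypothetical names that do not exist in the kernel, a CPU mask is reduced to a 64-bit word, and locking and the irqd_irq_disabled() check are left out. It assumes that one new edge is marked pending by another CPU during each handler run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */

/* Return true if `cpu` is set in the 64-bit CPU mask `mask`. */
static bool mask_test_cpu(unsigned int cpu, uint64_t mask)
{
	assert(cpu < 64);
	return (mask >> cpu) & 1u;
}

/*
 * Count how often the modelled do {} while () loop runs the handler on `cpu`.
 * `arrivals` is the number of further edges that another CPU marks pending
 * while this CPU is inside the handler (assumed: one per iteration).
 * `check_affinity` selects between the original and the patched condition.
 */
static unsigned long handled_events(unsigned int cpu, uint64_t affinity,
				    unsigned long arrivals, bool check_affinity)
{
	unsigned long handled = 0;
	bool pending;

	do {
		handled++;              /* models handle_irq_event() */
		pending = arrivals > 0; /* models the other CPU setting IRQS_PENDING */
		if (pending)
			arrivals--;
	} while (pending &&
		 (!check_affinity || mask_test_cpu(cpu, affinity)));

	return handled;
}
```

Under this model, `handled_events(0, 1u << 1, 1000, false)` runs the handler 1001 times on CPU 0, while the same call with `check_affinity` set to true leaves the loop after the first run because CPU 0 is no longer in the mask.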


2023-09-26 17:47:29

by Thomas Gleixner

Subject: Re: [PATCH v2] genirq: avoid long loops in handle_edge_irq

On Mon, Sep 25 2023 at 10:51, Wei Gong wrote:
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index dc94e0bf2c94..6da455e1a692 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
> handle_irq_event(desc);
>
> } while ((desc->istate & IRQS_PENDING) &&
> - !irqd_irq_disabled(&desc->irq_data));
> + !irqd_irq_disabled(&desc->irq_data) &&
> + cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));

Assume affinity mask has CPU0 and CPU1 set and the loop is on CPU0, but
the effective affinity is on CPU1 then how is this going to move the
interrupt?

Thanks,

tglx

2023-09-27 10:54:57

by Wei Gong

Subject: Re: [PATCH v2] genirq: avoid long loops in handle_edge_irq

On Tue, Sep 26, 2023 at 02:28:21PM +0200, Thomas Gleixner wrote:
> On Mon, Sep 25 2023 at 10:51, Wei Gong wrote:
> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> > index dc94e0bf2c94..6da455e1a692 100644
> > --- a/kernel/irq/chip.c
> > +++ b/kernel/irq/chip.c
> > @@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
> > handle_irq_event(desc);
> >
> > } while ((desc->istate & IRQS_PENDING) &&
> > - !irqd_irq_disabled(&desc->irq_data));
> > + !irqd_irq_disabled(&desc->irq_data) &&
> > + cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));
>
> Assume affinity mask has CPU0 and CPU1 set and the loop is on CPU0, but
> the effective affinity is on CPU1 then how is this going to move the
> interrupt?

The loop running on CPU0 means that the previous effective affinity was
on CPU0. When the previous effective affinity is a subset of the new
affinity mask, the effective affinity is not updated.
So, as I understand it, the scenario you describe cannot occur?

>
> Thanks,
>
> tglx


Thanks,

Wei Gong

2023-09-27 16:49:09

by Thomas Gleixner

Subject: Re: [PATCH v2] genirq: avoid long loops in handle_edge_irq

On Wed, Sep 27 2023 at 15:53, Wei Gong wrote:
> On Tue, Sep 26, 2023 at 02:28:21PM +0200, Thomas Gleixner wrote:
>> On Mon, Sep 25 2023 at 10:51, Wei Gong wrote:
>> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>> > index dc94e0bf2c94..6da455e1a692 100644
>> > --- a/kernel/irq/chip.c
>> > +++ b/kernel/irq/chip.c
>> > @@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
>> > handle_irq_event(desc);
>> >
>> > } while ((desc->istate & IRQS_PENDING) &&
>> > - !irqd_irq_disabled(&desc->irq_data));
>> > + !irqd_irq_disabled(&desc->irq_data) &&
>> > + cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));
>>
>> Assume affinity mask has CPU0 and CPU1 set and the loop is on CPU0, but
>> the effective affinity is on CPU1 then how is this going to move the
>> interrupt?
>
> Loop is on the CPU0 means that the previous effective affinity was on CPU0.
> When the previous effective affinity is a subset of the new affinity mask,
> the effective affinity will not be updated.

That's an implementation detail of a particular interrupt chip driver,
but not a general guaranteed behaviour.

2023-09-28 02:30:23

by Wei Gong

Subject: Re: [PATCH v2] genirq: avoid long loops in handle_edge_irq

On Wed, Sep 27, 2023 at 05:25:24PM +0200, Thomas Gleixner wrote:
> On Wed, Sep 27 2023 at 15:53, Wei Gong wrote:
> > On Tue, Sep 26, 2023 at 02:28:21PM +0200, Thomas Gleixner wrote:
> >> On Mon, Sep 25 2023 at 10:51, Wei Gong wrote:
> >> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> >> > index dc94e0bf2c94..6da455e1a692 100644
> >> > --- a/kernel/irq/chip.c
> >> > +++ b/kernel/irq/chip.c
> >> > @@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
> >> > handle_irq_event(desc);
> >> >
> >> > } while ((desc->istate & IRQS_PENDING) &&
> >> > - !irqd_irq_disabled(&desc->irq_data));
> >> > + !irqd_irq_disabled(&desc->irq_data) &&
> >> > + cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));
> >>
> >> Assume affinity mask has CPU0 and CPU1 set and the loop is on CPU0, but
> >> the effective affinity is on CPU1 then how is this going to move the
> >> interrupt?

Can replacing irq_data_get_affinity_mask with irq_data_get_effective_affinity_mask
solve this issue?

> >
> > Loop is on the CPU0 means that the previous effective affinity was on CPU0.
> > When the previous effective affinity is a subset of the new affinity mask,
> > the effective affinity will not be updated.
>
> That's an implementation detail of a particular interrupt chip driver,
> but not a general guaranteed behaviour.
>

2023-09-28 13:21:28

by Thomas Gleixner

Subject: Re: [PATCH v2] genirq: avoid long loops in handle_edge_irq

On Thu, Sep 28 2023 at 10:22, Wei Gong wrote:
> On Wed, Sep 27, 2023 at 05:25:24PM +0200, Thomas Gleixner wrote:
>> On Wed, Sep 27 2023 at 15:53, Wei Gong wrote:
>> > On Tue, Sep 26, 2023 at 02:28:21PM +0200, Thomas Gleixner wrote:
>> >> On Mon, Sep 25 2023 at 10:51, Wei Gong wrote:
>> >> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>> >> > index dc94e0bf2c94..6da455e1a692 100644
>> >> > --- a/kernel/irq/chip.c
>> >> > +++ b/kernel/irq/chip.c
>> >> > @@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
>> >> > handle_irq_event(desc);
>> >> >
>> >> > } while ((desc->istate & IRQS_PENDING) &&
>> >> > - !irqd_irq_disabled(&desc->irq_data));
>> >> > + !irqd_irq_disabled(&desc->irq_data) &&
>> >> > + cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));
>> >>
>> >> Assume affinity mask has CPU0 and CPU1 set and the loop is on CPU0, but
>> >> the effective affinity is on CPU1 then how is this going to move the
>> >> interrupt?
>> >
>> > Loop is on the CPU0 means that the previous effective affinity was on CPU0.
>> > When the previous effective affinity is a subset of the new affinity mask,
>> > the effective affinity will not be updated.
>>
>> That's an implementation detail of a particular interrupt chip driver,
>> but not a general guaranteed behaviour.
>>
>
> Can replacing irq_data_get_affinity_mask with irq_data_get_effective_affinity_mask
> solve this issue?

Yes.
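
The difference between the two masks in the scenario discussed above (affinity mask with CPU0 and CPU1 set, effective affinity on CPU1, loop running on CPU0) can be sketched in plain C. This is a hypothetical userspace model, not kernel code: `struct irq_model`, `mask_test_cpu()`, `keep_looping_v2()` and `keep_looping_effective()` are names invented for illustration, and each mask is reduced to a 64-bit word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two masks tracked for one interrupt. */
struct irq_model {
	uint64_t affinity;           /* CPUs the interrupt is allowed to target */
	uint64_t effective_affinity; /* CPU(s) the interrupt is actually routed to */
};

/* Return true if `cpu` is set in the 64-bit CPU mask `mask`. */
static bool mask_test_cpu(unsigned int cpu, uint64_t mask)
{
	assert(cpu < 64);
	return (mask >> cpu) & 1u;
}

/* Extra loop condition as posted in v2: test the affinity mask. */
static bool keep_looping_v2(const struct irq_model *irq, unsigned int cpu)
{
	return mask_test_cpu(cpu, irq->affinity);
}

/* Extra loop condition with the suggested change: test the effective mask. */
static bool keep_looping_effective(const struct irq_model *irq,
				   unsigned int cpu)
{
	return mask_test_cpu(cpu, irq->effective_affinity);
}
```

With `affinity = 0x3` and `effective_affinity = 0x2`, `keep_looping_v2()` still returns true on CPU 0, so the loop there would not end early, whereas `keep_looping_effective()` returns false on CPU 0 and true only on CPU 1.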