2021-05-01 02:23:16

by Jesse Brandeburg

[permalink] [raw]
Subject: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

It was pointed out by Nitesh that the original work I did in 2014
to automatically set the interrupt affinity when requesting a
mask is no longer necessary. The kernel has moved on and no
longer has the original problem, BUT the original patch
introduced a subtle bug when booting a system with reserved or
excluded CPUs. Drivers calling this function with a mask value
that included a CPU that was currently or in the future
unavailable would generally not update the hint.

I'm sure there are a million ways to solve this, but the simplest
one is to just remove a little code that tries to force the
affinity, as Nitesh has shown it fixes the bug and doesn't seem
to introduce immediate side effects.

While I'm here, introduce a kernel-doc for the hint function.

Ref: https://lore.kernel.org/lkml/CAFki+L=_dd+JgAR12_eBPX0kZO2_6=1dGdgkwHE=u=K6chMeLQ@mail.gmail.com/
Cc: [email protected]
Fixes: 4fe7ffb7e17c ("genirq: Fix null pointer reference in irq_set_affinity_hint()")
Fixes: e2e64a932556 ("genirq: Set initial affinity in irq_set_affinity_hint()")
Reported-by: Nitesh Lal <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
---

!!! NOTE: Compile tested only, would appreciate feedback

---
kernel/irq/manage.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e976c4927b25..a31df64662d5 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -456,6 +456,16 @@ int __irq_set_affinity(unsigned int irq, const struct cpumask *mask, bool force)
return ret;
}

+/**
+ * irq_set_affinity_hint - set the hint for an irq
+ * @irq: Interrupt for which to set the hint
+ * @m: Mask to indicate which CPUs to suggest for the interrupt, use
+ * NULL here to indicate to clear the value.
+ *
+ * Use this function to recommend which CPU should handle the
+ * interrupt to any userspace that uses /proc/irq/nn/smp_affinity_hint
+ * in order to align interrupts. Pass NULL as the mask to clear the hint.
+ */
int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
{
unsigned long flags;
@@ -465,9 +475,6 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
return -EINVAL;
desc->affinity_hint = m;
irq_put_desc_unlock(desc, flags);
- /* set the initial affinity to prevent every interrupt being on CPU0 */
- if (m)
- __irq_set_affinity(irq, m, false);
return 0;
}
EXPORT_SYMBOL_GPL(irq_set_affinity_hint);

base-commit: 765822e1569a37aab5e69736c52d4ad4a289eba6
--
2.30.2


2021-05-04 12:17:11

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On 2021-05-01 03:18, Jesse Brandeburg wrote:
> It was pointed out by Nitesh that the original work I did in 2014
> to automatically set the interrupt affinity when requesting a
> mask is no longer necessary. The kernel has moved on and no
> longer has the original problem, BUT the original patch
> introduced a subtle bug when booting a system with reserved or
> excluded CPUs. Drivers calling this function with a mask value
> that included a CPU that was currently or in the future
> unavailable would generally not update the hint.
>
> I'm sure there are a million ways to solve this, but the simplest
> one is to just remove a little code that tries to force the
> affinity, as Nitesh has shown it fixes the bug and doesn't seem
> to introduce immediate side effects.

Unfortunately, I think there are quite a few other drivers now relying
on this behaviour, since they are really using irq_set_affinity_hint()
as a proxy for irq_set_affinity(). Partly since the latter isn't
exported to modules, but also I have a vague memory of it being said
that it's nice to update the user-visible hint to match when the
affinity does have to be forced to something specific.

Robin.

> While I'm here, introduce a kernel-doc for the hint function.
>
> Ref: https://lore.kernel.org/lkml/CAFki+L=_dd+JgAR12_eBPX0kZO2_6=1dGdgkwHE=u=K6chMeLQ@mail.gmail.com/
> Cc: [email protected]
> Fixes: 4fe7ffb7e17c ("genirq: Fix null pointer reference in irq_set_affinity_hint()")
> Fixes: e2e64a932556 ("genirq: Set initial affinity in irq_set_affinity_hint()")
> Reported-by: Nitesh Lal <[email protected]>
> Signed-off-by: Jesse Brandeburg <[email protected]>
> ---
>
> !!! NOTE: Compile tested only, would appreciate feedback
>
> ---
> kernel/irq/manage.c | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index e976c4927b25..a31df64662d5 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -456,6 +456,16 @@ int __irq_set_affinity(unsigned int irq, const struct cpumask *mask, bool force)
> return ret;
> }
>
> +/**
> + * irq_set_affinity_hint - set the hint for an irq
> + * @irq: Interrupt for which to set the hint
> + * @m: Mask to indicate which CPUs to suggest for the interrupt, use
> + * NULL here to indicate to clear the value.
> + *
> + * Use this function to recommend which CPU should handle the
> + * interrupt to any userspace that uses /proc/irq/nn/smp_affinity_hint
> + * in order to align interrupts. Pass NULL as the mask to clear the hint.
> + */
> int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> {
> unsigned long flags;
> @@ -465,9 +475,6 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> return -EINVAL;
> desc->affinity_hint = m;
> irq_put_desc_unlock(desc, flags);
> - /* set the initial affinity to prevent every interrupt being on CPU0 */
> - if (m)
> - __irq_set_affinity(irq, m, false);
> return 0;
> }
> EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
>
> base-commit: 765822e1569a37aab5e69736c52d4ad4a289eba6
>

2021-05-04 15:06:11

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Tue, May 4, 2021 at 8:15 AM Robin Murphy <[email protected]> wrote:
>
> On 2021-05-01 03:18, Jesse Brandeburg wrote:
> > It was pointed out by Nitesh that the original work I did in 2014
> > to automatically set the interrupt affinity when requesting a
> > mask is no longer necessary. The kernel has moved on and no
> > longer has the original problem, BUT the original patch
> > introduced a subtle bug when booting a system with reserved or
> > excluded CPUs. Drivers calling this function with a mask value
> > that included a CPU that was currently or in the future
> > unavailable would generally not update the hint.
> >
> > I'm sure there are a million ways to solve this, but the simplest
> > one is to just remove a little code that tries to force the
> > affinity, as Nitesh has shown it fixes the bug and doesn't seem
> > to introduce immediate side effects.
>
> Unfortunately, I think there are quite a few other drivers now relying
> on this behaviour, since they are really using irq_set_affinity_hint()
> as a proxy for irq_set_affinity().

That's true.

> Partly since the latter isn't
> exported to modules, but also I have a vague memory of it being said
> that it's nice to update the user-visible hint to match when the
> affinity does have to be forced to something specific.

If you see the downside of it we are forcing the affinity to match the hint
mask without considering the default SMP affinity mask.

Also, we are repeating things here. First, we set certain mask for a device
IRQ via request_irq code path which does consider the default SMP mask but
then we are letting the driver over-write it.

If we want to set the IRQ mask in a certain way then it should be done at
the time of initial setup itself.

Do you know about a workload/use case that can show the benefit of
this behavior? As then we can try fixing it in the right way.

--
Thanks
Nitesh

2021-05-04 16:28:07

by Jesse Brandeburg

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

Robin Murphy wrote:

> On 2021-05-01 03:18, Jesse Brandeburg wrote:
> > It was pointed out by Nitesh that the original work I did in 2014
> > to automatically set the interrupt affinity when requesting a
> > mask is no longer necessary. The kernel has moved on and no
> > longer has the original problem, BUT the original patch
> > introduced a subtle bug when booting a system with reserved or
> > excluded CPUs. Drivers calling this function with a mask value
> > that included a CPU that was currently or in the future
> > unavailable would generally not update the hint.
> >
> > I'm sure there are a million ways to solve this, but the simplest
> > one is to just remove a little code that tries to force the
> > affinity, as Nitesh has shown it fixes the bug and doesn't seem
> > to introduce immediate side effects.
>
> Unfortunately, I think there are quite a few other drivers now relying
> on this behaviour, since they are really using irq_set_affinity_hint()
> as a proxy for irq_set_affinity(). Partly since the latter isn't
> exported to modules, but also I have a vague memory of it being said
> that it's nice to update the user-visible hint to match when the
> affinity does have to be forced to something specific.
>
> Robin.

Thanks for your feedback Robin, but there is definitely a bug here that
is being exposed by this code. The fact that people are using this
function means they're all exposed to this bug.

Not sure if you saw, but this analysis from Nitesh explains what
happened chronologically to the kernel w.r.t this code, it's a useful
analysis! [1]

I'd add in addition that irqbalance daemon *stopped* paying attention
to hints quite a while ago, so I'm not quite sure what purpose they
serve.

[1]
https://lore.kernel.org/lkml/CAFki+Lm0W_brLu31epqD3gAV+WNKOJfVDfX2M8ZM__aj3nv9uA@mail.gmail.com/

2021-05-18 20:41:16

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Tue, May 4, 2021 at 12:25 PM Jesse Brandeburg
<[email protected]> wrote:
>
> Robin Murphy wrote:
>
> > On 2021-05-01 03:18, Jesse Brandeburg wrote:
> > > It was pointed out by Nitesh that the original work I did in 2014
> > > to automatically set the interrupt affinity when requesting a
> > > mask is no longer necessary. The kernel has moved on and no
> > > longer has the original problem, BUT the original patch
> > > introduced a subtle bug when booting a system with reserved or
> > > excluded CPUs. Drivers calling this function with a mask value
> > > that included a CPU that was currently or in the future
> > > unavailable would generally not update the hint.
> > >
> > > I'm sure there are a million ways to solve this, but the simplest
> > > one is to just remove a little code that tries to force the
> > > affinity, as Nitesh has shown it fixes the bug and doesn't seem
> > > to introduce immediate side effects.
> >
> > Unfortunately, I think there are quite a few other drivers now relying
> > on this behaviour, since they are really using irq_set_affinity_hint()
> > as a proxy for irq_set_affinity(). Partly since the latter isn't
> > exported to modules, but also I have a vague memory of it being said
> > that it's nice to update the user-visible hint to match when the
> > affinity does have to be forced to something specific.
> >
> > Robin.
>
> Thanks for your feedback Robin, but there is definitely a bug here that
> is being exposed by this code. The fact that people are using this
> function means they're all exposed to this bug.
>
> Not sure if you saw, but this analysis from Nitesh explains what
> happened chronologically to the kernel w.r.t this code, it's a useful
> analysis! [1]
>
> I'd add in addition that irqbalance daemon *stopped* paying attention
> to hints quite a while ago, so I'm not quite sure what purpose they
> serve.
>
> [1]
> https://lore.kernel.org/lkml/CAFki+Lm0W_brLu31epqD3gAV+WNKOJfVDfX2M8ZM__aj3nv9uA@mail.gmail.com/
>

Wanted to follow up to see if there are any more objections or even
suggestions to take this forward?

--
Thanks
Nitesh


2021-05-18 21:09:57

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On 2021-05-17 17:57, Nitesh Lal wrote:
> On Tue, May 4, 2021 at 12:25 PM Jesse Brandeburg
> <[email protected]> wrote:
>>
>> Robin Murphy wrote:
>>
>>> On 2021-05-01 03:18, Jesse Brandeburg wrote:
>>>> It was pointed out by Nitesh that the original work I did in 2014
>>>> to automatically set the interrupt affinity when requesting a
>>>> mask is no longer necessary. The kernel has moved on and no
>>>> longer has the original problem, BUT the original patch
>>>> introduced a subtle bug when booting a system with reserved or
>>>> excluded CPUs. Drivers calling this function with a mask value
>>>> that included a CPU that was currently or in the future
>>>> unavailable would generally not update the hint.
>>>>
>>>> I'm sure there are a million ways to solve this, but the simplest
>>>> one is to just remove a little code that tries to force the
>>>> affinity, as Nitesh has shown it fixes the bug and doesn't seem
>>>> to introduce immediate side effects.
>>>
>>> Unfortunately, I think there are quite a few other drivers now relying
>>> on this behaviour, since they are really using irq_set_affinity_hint()
>>> as a proxy for irq_set_affinity(). Partly since the latter isn't
>>> exported to modules, but also I have a vague memory of it being said
>>> that it's nice to update the user-visible hint to match when the
>>> affinity does have to be forced to something specific.
>>>
>>> Robin.
>>
>> Thanks for your feedback Robin, but there is definitely a bug here that
>> is being exposed by this code. The fact that people are using this
>> function means they're all exposed to this bug.
>>
>> Not sure if you saw, but this analysis from Nitesh explains what
>> happened chronologically to the kernel w.r.t this code, it's a useful
>> analysis! [1]
>>
>> I'd add in addition that irqbalance daemon *stopped* paying attention
>> to hints quite a while ago, so I'm not quite sure what purpose they
>> serve.
>>
>> [1]
>> https://lore.kernel.org/lkml/CAFki+Lm0W_brLu31epqD3gAV+WNKOJfVDfX2M8ZM__aj3nv9uA@mail.gmail.com/
>>
>
> Wanted to follow up to see if there are any more objections or even
> suggestions to take this forward?

Oops, sorry, seems I got distracted before getting round to actually
typing up my response :)

I'm not implying that there isn't a bug, or that this code ever made
sense in the first place, just that fixing it will unfortunately be a
bit more involved than a simple revert. This patch as-is *will* subtly
break at least the system PMU drivers currently using
irq_set_affinity_hint() - those I know require the IRQ affinity to
follow whichever CPU the PMU context is bound to, in order to meet perf
core's assumptions about mutual exclusion.

As far as the consistency argument goes, maybe that's just backwards and
it should be irq_set_affinity() that also sets the hint, to indicate to
userspace that the affinity has been forced by the kernel? Either way
we'll need to do a little more diligence to figure out which callers
actually care about more than just the hint, and sort them out first.

Robin.

2021-05-18 22:31:25

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Mon, May 17 2021 at 18:26, Robin Murphy wrote:
> On 2021-05-17 17:57, Nitesh Lal wrote:
> I'm not implying that there isn't a bug, or that this code ever made
> sense in the first place, just that fixing it will unfortunately be a
> bit more involved than a simple revert. This patch as-is *will* subtly
> break at least the system PMU drivers currently using

s/using/abusing/

> irq_set_affinity_hint() - those I know require the IRQ affinity to
> follow whichever CPU the PMU context is bound to, in order to meet perf
> core's assumptions about mutual exclusion.

Which driver is that?

Thanks,

tglx

2021-05-18 23:29:21

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Mon, May 17, 2021 at 1:26 PM Robin Murphy <[email protected]> wrote:
>
> On 2021-05-17 17:57, Nitesh Lal wrote:
> > On Tue, May 4, 2021 at 12:25 PM Jesse Brandeburg
> > <[email protected]> wrote:
> >>
> >> Robin Murphy wrote:
> >>
> >>> On 2021-05-01 03:18, Jesse Brandeburg wrote:
> >>>> It was pointed out by Nitesh that the original work I did in 2014
> >>>> to automatically set the interrupt affinity when requesting a
> >>>> mask is no longer necessary. The kernel has moved on and no
> >>>> longer has the original problem, BUT the original patch
> >>>> introduced a subtle bug when booting a system with reserved or
> >>>> excluded CPUs. Drivers calling this function with a mask value
> >>>> that included a CPU that was currently or in the future
> >>>> unavailable would generally not update the hint.
> >>>>
> >>>> I'm sure there are a million ways to solve this, but the simplest
> >>>> one is to just remove a little code that tries to force the
> >>>> affinity, as Nitesh has shown it fixes the bug and doesn't seem
> >>>> to introduce immediate side effects.
> >>>
> >>> Unfortunately, I think there are quite a few other drivers now relying
> >>> on this behaviour, since they are really using irq_set_affinity_hint()
> >>> as a proxy for irq_set_affinity(). Partly since the latter isn't
> >>> exported to modules, but also I have a vague memory of it being said
> >>> that it's nice to update the user-visible hint to match when the
> >>> affinity does have to be forced to something specific.
> >>>
> >>> Robin.
> >>
> >> Thanks for your feedback Robin, but there is definitely a bug here that
> >> is being exposed by this code. The fact that people are using this
> >> function means they're all exposed to this bug.
> >>
> >> Not sure if you saw, but this analysis from Nitesh explains what
> >> happened chronologically to the kernel w.r.t this code, it's a useful
> >> analysis! [1]
> >>
> >> I'd add in addition that irqbalance daemon *stopped* paying attention
> >> to hints quite a while ago, so I'm not quite sure what purpose they
> >> serve.
> >>
> >> [1]
> >> https://lore.kernel.org/lkml/CAFki+Lm0W_brLu31epqD3gAV+WNKOJfVDfX2M8ZM__aj3nv9uA@mail.gmail.com/
> >>
> >
> > Wanted to follow up to see if there are any more objections or even
> > suggestions to take this forward?
>
> Oops, sorry, seems I got distracted before getting round to actually
> typing up my response :)

No worries.

>
> I'm not implying that there isn't a bug, or that this code ever made
> sense in the first place, just that fixing it will unfortunately be a
> bit more involved than a simple revert.

Fair point.

> This patch as-is *will* subtly
> break at least the system PMU drivers currently using
> irq_set_affinity_hint() - those I know require the IRQ affinity to
> follow whichever CPU the PMU context is bound to, in order to meet perf
> core's assumptions about mutual exclusion.

Thanks for bringing this up.
Please correct me if I am wrong, so the PMU driver(s) is/are written
in a way that
it uses the hint API to overwrite the previously set affinity mask with a
CPU to which the PMU context is bound to?

Is this context information exposed in the userspace and can we modify the
IRQ affinity mask from the userspace based on that?
I do understand that this is a behavior change from the PMU drivers
perspective.

>
> As far as the consistency argument goes, maybe that's just backwards and
> it should be irq_set_affinity() that also sets the hint, to indicate to
> userspace that the affinity has been forced by the kernel? Either way
> we'll need to do a little more diligence to figure out which callers
> actually care about more than just the hint, and sort them out first.
>

We can use irq_set_affinity() to set the hint mask as well, however, maybe
there is a specific reason behind separating those two in the
first place (maybe not?).
But even in this case, we have to either modify the PMU drivers' IRQs
affinity from the userspace or we will have to make changes in the existing
request_irq code path.
I am not sure about the latter because we already have the required controls
to adjust the device IRQ mask (by using default_smp_affinity or by modifying
them manually).

--
Thanks
Nitesh


2021-05-19 01:10:35

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On 2021-05-17 19:08, Thomas Gleixner wrote:
> On Mon, May 17 2021 at 18:26, Robin Murphy wrote:
>> On 2021-05-17 17:57, Nitesh Lal wrote:
>> I'm not implying that there isn't a bug, or that this code ever made
>> sense in the first place, just that fixing it will unfortunately be a
>> bit more involved than a simple revert. This patch as-is *will* subtly
>> break at least the system PMU drivers currently using
>
> s/using/abusing/
>
>> irq_set_affinity_hint() - those I know require the IRQ affinity to
>> follow whichever CPU the PMU context is bound to, in order to meet perf
>> core's assumptions about mutual exclusion.
>
> Which driver is that?

Right now, any driver which wants to control an IRQ's affinity and also
build as a module, for one thing. I'm familiar with drivers/perf/ where
a basic pattern has been widely copied; some of the callers in other
subsystems appear to *expect* it to set the underlying affinity as well,
but whether any of those added within the last 6 years represent a
functional dependency rather than just a performance concern I don't know.

Robin.

2021-05-19 01:19:15

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Mon, May 17 2021 at 19:50, Robin Murphy wrote:

> On 2021-05-17 19:08, Thomas Gleixner wrote:
>> On Mon, May 17 2021 at 18:26, Robin Murphy wrote:
>>> On 2021-05-17 17:57, Nitesh Lal wrote:
>>> I'm not implying that there isn't a bug, or that this code ever made
>>> sense in the first place, just that fixing it will unfortunately be a
>>> bit more involved than a simple revert. This patch as-is *will* subtly
>>> break at least the system PMU drivers currently using
>>
>> s/using/abusing/
>>
>>> irq_set_affinity_hint() - those I know require the IRQ affinity to
>>> follow whichever CPU the PMU context is bound to, in order to meet perf
>>> core's assumptions about mutual exclusion.
>>
>> Which driver is that?
>
> Right now, any driver which wants to control an IRQ's affinity and also
> build as a module, for one thing. I'm familiar with drivers/perf/ where
> a basic pattern has been widely copied;

Bah. Why the heck can't people talk and just go and rumage until they
find something which hopefully does what they want...

The name of that function should have rang all alarm bells...

> some of the callers in other subsystems appear to *expect* it to set
> the underlying affinity as well, but whether any of those added within
> the last 6 years represent a functional dependency rather than just a
> performance concern I don't know.

Sigh. Let me do yet another tree wide audit...

Thanks,

tglx



2021-05-19 02:22:41

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Mon, May 17 2021 at 21:08, Thomas Gleixner wrote:
> On Mon, May 17 2021 at 19:50, Robin Murphy wrote:
>> some of the callers in other subsystems appear to *expect* it to set
>> the underlying affinity as well, but whether any of those added within
>> the last 6 years represent a functional dependency rather than just a
>> performance concern I don't know.
>
> Sigh. Let me do yet another tree wide audit...

It's clearly only the perf muck which has a functional dependency.

None of the other usage sites has IRQF_NOBALANCING set which clearly
makes this a hint because user space can freely muck with the affinity.

Thanks,

tglx




2021-05-19 02:23:04

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

Nitesh,

On Mon, May 17 2021 at 14:21, Nitesh Lal wrote:
> On Mon, May 17, 2021 at 1:26 PM Robin Murphy <[email protected]> wrote:
>
> We can use irq_set_affinity() to set the hint mask as well, however, maybe
> there is a specific reason behind separating those two in the
> first place (maybe not?).

Yes, because kernel side settings might overwrite the hint.

> But even in this case, we have to either modify the PMU drivers' IRQs
> affinity from the userspace or we will have to make changes in the existing
> request_irq code path.

Adjusting them from user space does not work for various reasons,
especially CPU hotplug.

> I am not sure about the latter because we already have the required controls
> to adjust the device IRQ mask (by using default_smp_affinity or by modifying
> them manually).

default_smp_affinity does not help at all and there is nothing a module
can modify manually.

I'll send out a patch series which cleans that up soon.

Thanks,

tglx



2021-05-19 04:57:07

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Mon, May 17 2021 at 21:08, Thomas Gleixner wrote:
> On Mon, May 17 2021 at 19:50, Robin Murphy wrote:
>> On 2021-05-17 19:08, Thomas Gleixner wrote:
>>> On Mon, May 17 2021 at 18:26, Robin Murphy wrote:
>>>> On 2021-05-17 17:57, Nitesh Lal wrote:
>>>> I'm not implying that there isn't a bug, or that this code ever made
>>>> sense in the first place, just that fixing it will unfortunately be a
>>>> bit more involved than a simple revert. This patch as-is *will* subtly
>>>> break at least the system PMU drivers currently using
>>>
>>> s/using/abusing/
>>>
>>>> irq_set_affinity_hint() - those I know require the IRQ affinity to
>>>> follow whichever CPU the PMU context is bound to, in order to meet perf
>>>> core's assumptions about mutual exclusion.
>>>
>>> Which driver is that?
>>
>> Right now, any driver which wants to control an IRQ's affinity and also
>> build as a module, for one thing. I'm familiar with drivers/perf/ where
>> a basic pattern has been widely copied;
>
> Bah. Why the heck can't people talk and just go and rumage until they
> find something which hopefully does what they want...
>
> The name of that function should have rang all alarm bells...

Aside of that all the warnings around the return value are useless cargo
cult. Why?

The only reason why this function returns an error code is when there is
no irq descriptor assigned to the interrupt number, which is well close
to impossible in that context.

But it does _NOT_ return an error when the actual affinity setting
fails...

Thanks,

tglx

2021-05-19 08:06:13

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Tue, May 04 2021 at 09:23, Jesse Brandeburg wrote:
> I'd add in addition that irqbalance daemon *stopped* paying attention
> to hints quite a while ago, so I'm not quite sure what purpose they
> serve.

The hint was added so that userspace has a better understanding where it
should place the interrupt. So if irqbalanced ignores it anyway, then
what's the point of the hint? IOW, why is it still used drivers?

Now there is another aspect to that. What happens if irqbalanced does
not run at all and a driver relies on the side effect of the hint
setting the initial affinity. Bah...

While none of the drivers (except the perf muck) actually prevents
userspace from fiddling with the affinity (via IRQF_NOBALANCING) a
deeper inspection shows that they actually might rely on the current
behaviour if irqbalanced is disabled. Of course every driver has its own
convoluted way to do that and all of those functions are well
documented. What a mess.

If the hint still serves a purpose then we can provide a variant which
solely applies the hint and does not fiddle with the actual affinity,
but if the hint is useless anyway then we have a way better option to
clean that up.

Most users are in networking, there are a few in crypto, a couple of
leftovers in scsi, virtio and a handfull of oddball drivers.

The perf muck wants to be cleaned up anyway as it's just crystal clear
abuse.

Thanks,

tglx

2021-05-19 08:09:10

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Mon, May 17, 2021 at 3:47 PM Thomas Gleixner <[email protected]> wrote:
>
> Nitesh,
>
> On Mon, May 17 2021 at 14:21, Nitesh Lal wrote:
> > On Mon, May 17, 2021 at 1:26 PM Robin Murphy <[email protected]> wrote:
> >
> > We can use irq_set_affinity() to set the hint mask as well, however, maybe
> > there is a specific reason behind separating those two in the
> > first place (maybe not?).
>
> Yes, because kernel side settings might overwrite the hint.
>
> > But even in this case, we have to either modify the PMU drivers' IRQs
> > affinity from the userspace or we will have to make changes in the existing
> > request_irq code path.
>
> Adjusting them from user space does not work for various reasons,
> especially CPU hotplug.
>
> > I am not sure about the latter because we already have the required controls
> > to adjust the device IRQ mask (by using default_smp_affinity or by modifying
> > them manually).
>
> default_smp_affinity does not help at all and there is nothing a module
> can modify manually.

Right, it will not help a module.

>
> I'll send out a patch series which cleans that up soon.

Ack, thanks.

>
> Thanks,
>
> tglx
>
>


--
Nitesh


2021-05-19 09:04:03

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Mon, May 17, 2021 at 4:48 PM Thomas Gleixner <[email protected]> wrote:
>
> On Tue, May 04 2021 at 09:23, Jesse Brandeburg wrote:
> > I'd add in addition that irqbalance daemon *stopped* paying attention
> > to hints quite a while ago, so I'm not quite sure what purpose they
> > serve.
>
> The hint was added so that userspace has a better understanding where it
> should place the interrupt. So if irqbalanced ignores it anyway, then
> what's the point of the hint? IOW, why is it still used drivers?
>

Took a quick look at the irqbalance repo and saw the following commit:

dcc411e7bf remove affinity_hint infrastructure

The commit message mentions that "PJ is redesiging how affinity hinting
works in the kernel, the future model will just tell us to ignore an IRQ,
and the kernel will handle placement for us. As such we can remove the
affinity_hint recognition entirely".

This does indicate that apparently, irqbalance moved away from the usage of
affinity_hint. However, the next question is what was this future model?
I don't know but I can surely look into it if that helps or maybe someone
here already knows about it?

> Now there is another aspect to that. What happens if irqbalanced does
> not run at all and a driver relies on the side effect of the hint
> setting the initial affinity. Bah...
>

Right, but if they only rely on this API so that the IRQs are spread across
all the CPUs then that issue is already resolved and these other drivers
should not regress because of changing this behavior. Isn't it?

> While none of the drivers (except the perf muck) actually prevents
> userspace from fiddling with the affinity (via IRQF_NOBALANCING) a
> deeper inspection shows that they actually might rely on the current
> behaviour if irqbalanced is disabled. Of course every driver has its own
> convoluted way to do that and all of those functions are well
> documented. What a mess.
>
> If the hint still serves a purpose then we can provide a variant which
> solely applies the hint and does not fiddle with the actual affinity,
> but if the hint is useless anyway then we have a way better option to
> clean that up.
>

+1



--
Thanks
Nitesh


2021-05-19 09:20:48

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Mon, May 17 2021 at 18:44, Nitesh Lal wrote:
> On Mon, May 17, 2021 at 4:48 PM Thomas Gleixner <[email protected]> wrote:
>> The hint was added so that userspace has a better understanding where it
>> should place the interrupt. So if irqbalanced ignores it anyway, then
>> what's the point of the hint? IOW, why is it still used drivers?
>>
> Took a quick look at the irqbalance repo and saw the following commit:
>
> dcc411e7bf remove affinity_hint infrastructure
>
> The commit message mentions that "PJ is redesiging how affinity hinting
> works in the kernel, the future model will just tell us to ignore an IRQ,
> and the kernel will handle placement for us. As such we can remove the
> affinity_hint recognition entirely".

No idea who PJ is. I really love useful commit messages. Maybe Neil can
shed some light on that.

> This does indicate that apparently, irqbalance moved away from the usage of
> affinity_hint. However, the next question is what was this future
> model?

I might have missed something in the last 5 years, but that's the first
time I hear about someone trying to cleanup that thing.

> I don't know but I can surely look into it if that helps or maybe someone
> here already knows about it?

I CC'ed Neil :)

>> Now there is another aspect to that. What happens if irqbalanced does
>> not run at all and a driver relies on the side effect of the hint
>> setting the initial affinity. Bah...
>>
>
> Right, but if they only rely on this API so that the IRQs are spread across
> all the CPUs then that issue is already resolved and these other drivers
> should not regress because of changing this behavior. Isn't it?

Is that true for all architectures?

>> While none of the drivers (except the perf muck) actually prevents
>> userspace from fiddling with the affinity (via IRQF_NOBALANCING) a
>> deeper inspection shows that they actually might rely on the current
>> behaviour if irqbalanced is disabled. Of course every driver has its own
>> convoluted way to do that and all of those functions are well
>> documented. What a mess.
>>
>> If the hint still serves a purpose then we can provide a variant which
>> solely applies the hint and does not fiddle with the actual affinity,
>> but if the hint is useless anyway then we have a way better option to
>> clean that up.
>>
>
> +1

= 1

Thanks,

tglx

2021-05-19 09:21:06

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Mon, May 17, 2021 at 8:04 PM Thomas Gleixner <[email protected]> wrote:
>
> On Mon, May 17 2021 at 18:44, Nitesh Lal wrote:
> > On Mon, May 17, 2021 at 4:48 PM Thomas Gleixner <[email protected]> wrote:
> >> The hint was added so that userspace has a better understanding where it
> >> should place the interrupt. So if irqbalanced ignores it anyway, then
> >> what's the point of the hint? IOW, why is it still used drivers?
> >>
> > Took a quick look at the irqbalance repo and saw the following commit:
> >
> > dcc411e7bf remove affinity_hint infrastructure
> >
> > The commit message mentions that "PJ is redesiging how affinity hinting
> > works in the kernel, the future model will just tell us to ignore an IRQ,
> > and the kernel will handle placement for us. As such we can remove the
> > affinity_hint recognition entirely".
>
> No idea who PJ is. I really love useful commit messages. Maybe Neil can
> shed some light on that.
>
> > This does indicate that apparently, irqbalance moved away from the usage of
> > affinity_hint. However, the next question is what was this future
> > model?
>
> I might have missed something in the last 5 years, but that's the first
> time I hear about someone trying to cleanup that thing.
>
> > I don't know but I can surely look into it if that helps or maybe someone
> > here already knows about it?
>
> I CC'ed Neil :)

Thanks, I have added PJ Waskiewicz as well who I think was referred in
that commit message as PJ.

>
> >> Now there is another aspect to that. What happens if irqbalanced does
> >> not run at all and a driver relies on the side effect of the hint
> >> setting the initial affinity. Bah...
> >>
> >
> > Right, but if they only rely on this API so that the IRQs are spread across
> > all the CPUs then that issue is already resolved and these other drivers
> > should not regress because of changing this behavior. Isn't it?
>
> Is that true for all architectures?

Unfortunately, I don't know and that's probably why we have to be careful.

--
Nitesh


2021-05-21 08:17:13

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Mon, May 17, 2021 at 8:23 PM Nitesh Lal <[email protected]> wrote:
>
> On Mon, May 17, 2021 at 8:04 PM Thomas Gleixner <[email protected]> wrote:
> >
> > On Mon, May 17 2021 at 18:44, Nitesh Lal wrote:
> > > On Mon, May 17, 2021 at 4:48 PM Thomas Gleixner <[email protected]> wrote:
> > >> The hint was added so that userspace has a better understanding where it
> > >> should place the interrupt. So if irqbalanced ignores it anyway, then
> > >> what's the point of the hint? IOW, why is it still used drivers?
> > >>
> > > Took a quick look at the irqbalance repo and saw the following commit:
> > >
> > > dcc411e7bf remove affinity_hint infrastructure
> > >
> > > The commit message mentions that "PJ is redesiging how affinity hinting
> > > works in the kernel, the future model will just tell us to ignore an IRQ,
> > > and the kernel will handle placement for us. As such we can remove the
> > > affinity_hint recognition entirely".
> >
> > No idea who PJ is. I really love useful commit messages. Maybe Neil can
> > shed some light on that.
> >
> > > This does indicate that apparently, irqbalance moved away from the usage of
> > > affinity_hint. However, the next question is what was this future
> > > model?
> >
> > I might have missed something in the last 5 years, but that's the first
> > time I hear about someone trying to cleanup that thing.
> >
> > > I don't know but I can surely look into it if that helps or maybe someone
> > > here already knows about it?
> >
> > I CC'ed Neil :)
>
> Thanks, I have added PJ Waskiewicz as well who I think was referred in
> that commit message as PJ.
>
> >
> > >> Now there is another aspect to that. What happens if irqbalanced does
> > >> not run at all and a driver relies on the side effect of the hint
> > >> setting the initial affinity. Bah...
> > >>
> > >
> > > Right, but if they only rely on this API so that the IRQs are spread across
> > > all the CPUs then that issue is already resolved and these other drivers
> > > should not regress because of changing this behavior. Isn't it?
> >
> > Is that true for all architectures?
>
> Unfortunately, I don't know and that's probably why we have to be careful.

I think here to ensure that we are not breaking any of the drivers we have
to first analyze all the existing drivers and understand how they are using
this API.
AFAIK there are three possible scenarios:

- A driver use this API to spread the IRQs
+ For this case we should be safe considering the spreading is naturally
done from the IRQ subsystem itself.

- A driver use this API to actually set the hint
+ These drivers should have no functional impact because of this revert

- Driver use this API to force a certain affinity mask
+ In this case we have to replace the API with the irq_force_affinity()

I can start looking into the individual drivers, however, testing them will
be a challenge.

Any thoughts?

--
Thanks
Nitesh

2021-05-21 12:42:51

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

Nitesh,

On Thu, May 20 2021 at 20:03, Nitesh Lal wrote:
> On Thu, May 20, 2021 at 5:57 PM Nitesh Lal <[email protected]> wrote:
>> I think here to ensure that we are not breaking any of the drivers we have
>> to first analyze all the existing drivers and understand how they are using
>> this API.
>> AFAIK there are three possible scenarios:
>>
>> - A driver use this API to spread the IRQs
>> + For this case we should be safe considering the spreading is naturally
>> done from the IRQ subsystem itself.
>
> Forgot to mention another thing in the above case is to determine whether
> it is true for all architectures or not as Thomas mentioned.

Yes.

>>
>> - A driver use this API to actually set the hint
>> + These drivers should have no functional impact because of this revert

Correct.


>> - Driver use this API to force a certain affinity mask
>> + In this case we have to replace the API with the irq_force_affinity()

irq_set_affinity() or irq_set_affinity_and_hint()

>> I can start looking into the individual drivers, however, testing them will
>> be a challenge.

The only way to do that is to have the core infrastructure added and
then send patches changing it in the way you think. The relevant
maintainers/developers should be able to tell you when your analysis
went south. :)

Been there, done that. It's just lots of work :)

Thanks,

tglx

2021-05-21 12:48:33

by Thomas Gleixner

[permalink] [raw]
Subject: [PATCH] genirq: Provide new interfaces for affinity hints

The discussion about removing the side effect of irq_set_affinity_hint() of
actually applying the cpumask (if not NULL) as affinity to the interrupt,
unearthed a few unpleasantries:

1) The modular perf drivers rely on the current behaviour for the very
wrong reasons.

2) While none of the other drivers prevents user space from changing
the affinity, a cursorily inspection shows that there are at least
expectations in some drivers.

#1 needs to be cleaned up anyway, so that's not a problem

#2 might result in subtle regressions especially when irqbalanced (which
nowadays ignores the affinity hint) is disabled.

Provide new interfaces:

irq_update_affinity_hint() - Only sets the affinity hint pointer
irq_apply_affinity_hint() - Set the pointer and apply the affinity to
the interrupt

Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
document it to be phased out.

Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Applies on:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
---
include/linux/interrupt.h | 41 ++++++++++++++++++++++++++++++++++++++++-
kernel/irq/manage.c | 8 ++++----
2 files changed, 44 insertions(+), 5 deletions(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -328,7 +328,46 @@ extern int irq_force_affinity(unsigned i
extern int irq_can_set_affinity(unsigned int irq);
extern int irq_select_affinity(unsigned int irq);

-extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
+extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
+ bool setaffinity);
+
+/**
+ * irq_update_affinity_hint - Update the affinity hint
+ * @irq: Interrupt to update
+ * @cpumask: cpumask pointer (NULL to clear the hint)
+ *
+ * Updates the affinity hint, but does not change the affinity of the interrupt.
+ */
+static inline int
+irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
+{
+ return __irq_apply_affinity_hint(irq, m, true);
+}
+
+/**
+ * irq_apply_affinity_hint - Update the affinity hint and apply the provided
+ * cpumask to the interrupt
+ * @irq: Interrupt to update
+ * @cpumask: cpumask pointer (NULL to clear the hint)
+ *
+ * Updates the affinity hint and if @cpumask is not NULL it applies it as
+ * the affinity of that interrupt.
+ */
+static inline int
+irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m)
+{
+ return __irq_apply_affinity_hint(irq, m, true);
+}
+
+/*
+ * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
+ * instead.
+ */
+static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
+{
+ return irq_apply_affinity_hint(irq, cpumask);
+}
+
extern int irq_update_affinity_desc(unsigned int irq,
struct irq_affinity_desc *affinity);

--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -487,7 +487,8 @@ int irq_force_affinity(unsigned int irq,
}
EXPORT_SYMBOL_GPL(irq_force_affinity);

-int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
+int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
+ bool setaffinity)
{
unsigned long flags;
struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
@@ -496,12 +497,11 @@ int irq_set_affinity_hint(unsigned int i
return -EINVAL;
desc->affinity_hint = m;
irq_put_desc_unlock(desc, flags);
- /* set the initial affinity to prevent every interrupt being on CPU0 */
- if (m)
+ if (m && setaffinity)
__irq_set_affinity(irq, m, false);
return 0;
}
-EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
+EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);

static void irq_affinity_notify(struct work_struct *work)
{

2021-05-21 13:49:42

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Fri, May 21, 2021 at 7:56 AM Thomas Gleixner <[email protected]> wrote:
>
> Nitesh,
>
> On Thu, May 20 2021 at 20:03, Nitesh Lal wrote:
> > On Thu, May 20, 2021 at 5:57 PM Nitesh Lal <[email protected]> wrote:
> >> I think here to ensure that we are not breaking any of the drivers we have
> >> to first analyze all the existing drivers and understand how they are using
> >> this API.
> >> AFAIK there are three possible scenarios:
> >>
> >> - A driver use this API to spread the IRQs
> >> + For this case we should be safe considering the spreading is naturally
> >> done from the IRQ subsystem itself.
> >
> > Forgot to mention another thing in the above case is to determine whether
> > it is true for all architectures or not as Thomas mentioned.
>
> Yes.
>
> >>
> >> - A driver use this API to actually set the hint
> >> + These drivers should have no functional impact because of this revert
>
> Correct.
>
>
> >> - Driver use this API to force a certain affinity mask
> >> + In this case we have to replace the API with the irq_force_affinity()
>
> irq_set_affinity() or irq_set_affinity_and_hint()

Ah yes! my bad. _force_ doesn't check the mask against the online CPUs.
Hmm, I didn't realize that you also added irq_set_affinity_and_hint()
in your last patchset.

>
> >> I can start looking into the individual drivers, however, testing them will
> >> be a challenge.
>
> The only way to do that is to have the core infrastructure added and

Right.

> then send patches changing it in the way you think. The relevant
> maintainers/developers should be able to tell you when your analysis
> went south. :)

Ack will start looking into this.


--
Thanks
Nitesh

2021-05-21 20:05:14

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Thu, May 20, 2021 at 5:57 PM Nitesh Lal <[email protected]> wrote:
>
> On Mon, May 17, 2021 at 8:23 PM Nitesh Lal <[email protected]> wrote:
> >
> > On Mon, May 17, 2021 at 8:04 PM Thomas Gleixner <[email protected]> wrote:
> > >
> > > On Mon, May 17 2021 at 18:44, Nitesh Lal wrote:
> > > > On Mon, May 17, 2021 at 4:48 PM Thomas Gleixner <[email protected]> wrote:
> > > >> The hint was added so that userspace has a better understanding where it
> > > >> should place the interrupt. So if irqbalanced ignores it anyway, then
> > > >> what's the point of the hint? IOW, why is it still used drivers?
> > > >>
> > > > Took a quick look at the irqbalance repo and saw the following commit:
> > > >
> > > > dcc411e7bf remove affinity_hint infrastructure
> > > >
> > > > The commit message mentions that "PJ is redesiging how affinity hinting
> > > > works in the kernel, the future model will just tell us to ignore an IRQ,
> > > > and the kernel will handle placement for us. As such we can remove the
> > > > affinity_hint recognition entirely".
> > >
> > > No idea who PJ is. I really love useful commit messages. Maybe Neil can
> > > shed some light on that.
> > >
> > > > This does indicate that apparently, irqbalance moved away from the usage of
> > > > affinity_hint. However, the next question is what was this future
> > > > model?
> > >
> > > I might have missed something in the last 5 years, but that's the first
> > > time I hear about someone trying to cleanup that thing.
> > >
> > > > I don't know but I can surely look into it if that helps or maybe someone
> > > > here already knows about it?
> > >
> > > I CC'ed Neil :)
> >
> > Thanks, I have added PJ Waskiewicz as well who I think was referred in
> > that commit message as PJ.
> >
> > >
> > > >> Now there is another aspect to that. What happens if irqbalanced does
> > > >> not run at all and a driver relies on the side effect of the hint
> > > >> setting the initial affinity. Bah...
> > > >>
> > > >
> > > > Right, but if they only rely on this API so that the IRQs are spread across
> > > > all the CPUs then that issue is already resolved and these other drivers
> > > > should not regress because of changing this behavior. Isn't it?
> > >
> > > Is that true for all architectures?
> >
> > Unfortunately, I don't know and that's probably why we have to be careful.
>
> I think here to ensure that we are not breaking any of the drivers we have
> to first analyze all the existing drivers and understand how they are using
> this API.
> AFAIK there are three possible scenarios:
>
> - A driver use this API to spread the IRQs
> + For this case we should be safe considering the spreading is naturally
> done from the IRQ subsystem itself.

Forgot to mention another thing in the above case is to determine whether
it is true for all architectures or not as Thomas mentioned.

>
> - A driver use this API to actually set the hint
> + These drivers should have no functional impact because of this revert
>
> - Driver use this API to force a certain affinity mask
> + In this case we have to replace the API with the irq_force_affinity()
>
> I can start looking into the individual drivers, however, testing them will
> be a challenge.
>
> Any thoughts?
>
> --
> Thanks
> Nitesh



--
Thanks
Nitesh

2021-05-21 20:20:09

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH tip:irq/core v1] genirq: remove auto-set of the mask when setting the hint

On Fri, May 21 2021 at 09:46, Nitesh Lal wrote:
> On Fri, May 21, 2021 at 7:56 AM Thomas Gleixner <[email protected]> wrote:
>> >> - Driver use this API to force a certain affinity mask
>> >> + In this case we have to replace the API with the irq_force_affinity()
>>
>> irq_set_affinity() or irq_set_affinity_and_hint()
>
> Ah yes! my bad. _force_ doesn't check the mask against the online CPUs.
> Hmm, I didn't realize that you also added irq_set_affinity_and_hint()
> in your last patchset.

I did not. It just exposed irq_set_affinity().

See https://lore.kernel.org/r/[email protected]
for the new hint interface I came up with.

Thanks,

tglx

2021-05-21 20:21:55

by Lijun Pan

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

On Fri, May 21, 2021 at 7:48 AM Thomas Gleixner <[email protected]> wrote:
>
> The discussion about removing the side effect of irq_set_affinity_hint() of
> actually applying the cpumask (if not NULL) as affinity to the interrupt,
> unearthed a few unpleasantries:
>
> 1) The modular perf drivers rely on the current behaviour for the very
> wrong reasons.
>
> 2) While none of the other drivers prevents user space from changing
> the affinity, a cursorily inspection shows that there are at least
> expectations in some drivers.
>
> #1 needs to be cleaned up anyway, so that's not a problem
>
> #2 might result in subtle regressions especially when irqbalanced (which
> nowadays ignores the affinity hint) is disabled.
>
> Provide new interfaces:
>
> irq_update_affinity_hint() - Only sets the affinity hint pointer
> irq_apply_affinity_hint() - Set the pointer and apply the affinity to
> the interrupt
>
> Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> document it to be phased out.
>
> Signed-off-by: Thomas Gleixner <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> Applies on:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
> ---
> include/linux/interrupt.h | 41 ++++++++++++++++++++++++++++++++++++++++-
> kernel/irq/manage.c | 8 ++++----
> 2 files changed, 44 insertions(+), 5 deletions(-)
>
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -328,7 +328,46 @@ extern int irq_force_affinity(unsigned i
> extern int irq_can_set_affinity(unsigned int irq);
> extern int irq_select_affinity(unsigned int irq);
>
> -extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
> +extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> + bool setaffinity);
> +
> +/**
> + * irq_update_affinity_hint - Update the affinity hint
> + * @irq: Interrupt to update
> + * @cpumask: cpumask pointer (NULL to clear the hint)
> + *
> + * Updates the affinity hint, but does not change the affinity of the interrupt.
> + */
> +static inline int
> +irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return __irq_apply_affinity_hint(irq, m, true);
> +}

Should it be:
return __irq_apply_affinity_hint(irq, m, false);
here?

> +
> +/**
> + * irq_apply_affinity_hint - Update the affinity hint and apply the provided
> + * cpumask to the interrupt
> + * @irq: Interrupt to update
> + * @cpumask: cpumask pointer (NULL to clear the hint)
> + *
> + * Updates the affinity hint and if @cpumask is not NULL it applies it as
> + * the affinity of that interrupt.
> + */
> +static inline int
> +irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return __irq_apply_affinity_hint(irq, m, true);
> +}
> +
> +/*
> + * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
> + * instead.
> + */
> +static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return irq_apply_affinity_hint(irq, cpumask);
> +}
> +
> extern int irq_update_affinity_desc(unsigned int irq,
> struct irq_affinity_desc *affinity);
>
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -487,7 +487,8 @@ int irq_force_affinity(unsigned int irq,
> }
> EXPORT_SYMBOL_GPL(irq_force_affinity);
>
> -int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> + bool setaffinity)
> {
> unsigned long flags;
> struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
> @@ -496,12 +497,11 @@ int irq_set_affinity_hint(unsigned int i
> return -EINVAL;
> desc->affinity_hint = m;
> irq_put_desc_unlock(desc, flags);
> - /* set the initial affinity to prevent every interrupt being on CPU0 */
> - if (m)
> + if (m && setaffinity)
> __irq_set_affinity(irq, m, false);
> return 0;
> }
> -EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
> +EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
>
> static void irq_affinity_notify(struct work_struct *work)
> {

2021-05-21 20:23:53

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

On Fri, May 21, 2021 at 8:03 AM Thomas Gleixner <[email protected]> wrote:
>
> The discussion about removing the side effect of irq_set_affinity_hint() of
> actually applying the cpumask (if not NULL) as affinity to the interrupt,
> unearthed a few unpleasantries:
>
> 1) The modular perf drivers rely on the current behaviour for the very
> wrong reasons.
>
> 2) While none of the other drivers prevents user space from changing
> the affinity, a cursorily inspection shows that there are at least
> expectations in some drivers.
>
> #1 needs to be cleaned up anyway, so that's not a problem
>
> #2 might result in subtle regressions especially when irqbalanced (which
> nowadays ignores the affinity hint) is disabled.
>
> Provide new interfaces:
>
> irq_update_affinity_hint() - Only sets the affinity hint pointer
> irq_apply_affinity_hint() - Set the pointer and apply the affinity to
> the interrupt
>

Any reason why you ruled out the usage of irq_set_affinity_and_hint()?
IMHO the latter makes it very clear what the function is meant to do.


> Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> document it to be phased out.

Right, so eventually we will be only left with the following APIs that
the driver will use:
irq_set_affinity()- for drivers that only wants to set the affinity mask
irq_apply_affinity_hint/irq_set_affinity_and_hint() - for drivers that
wants to set same affinity and hint mask
irq_update_affinity_hint() - for drivers that only wants to update the hint mask

Thanks for clearing this.

>
> Signed-off-by: Thomas Gleixner <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> Applies on:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
> ---
> include/linux/interrupt.h | 41 ++++++++++++++++++++++++++++++++++++++++-
> kernel/irq/manage.c | 8 ++++----
> 2 files changed, 44 insertions(+), 5 deletions(-)
>
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -328,7 +328,46 @@ extern int irq_force_affinity(unsigned i
> extern int irq_can_set_affinity(unsigned int irq);
> extern int irq_select_affinity(unsigned int irq);
>
> -extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
> +extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> + bool setaffinity);
> +
> +/**
> + * irq_update_affinity_hint - Update the affinity hint
> + * @irq: Interrupt to update
> + * @cpumask: cpumask pointer (NULL to clear the hint)
> + *
> + * Updates the affinity hint, but does not change the affinity of the interrupt.
> + */
> +static inline int
> +irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return __irq_apply_affinity_hint(irq, m, true);
> +}
> +
> +/**
> + * irq_apply_affinity_hint - Update the affinity hint and apply the provided
> + * cpumask to the interrupt
> + * @irq: Interrupt to update
> + * @cpumask: cpumask pointer (NULL to clear the hint)
> + *
> + * Updates the affinity hint and if @cpumask is not NULL it applies it as
> + * the affinity of that interrupt.
> + */
> +static inline int
> +irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return __irq_apply_affinity_hint(irq, m, true);
> +}
> +
> +/*
> + * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
> + * instead.
> + */
> +static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return irq_apply_affinity_hint(irq, cpumask);
> +}
> +
> extern int irq_update_affinity_desc(unsigned int irq,
> struct irq_affinity_desc *affinity);
>
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -487,7 +487,8 @@ int irq_force_affinity(unsigned int irq,
> }
> EXPORT_SYMBOL_GPL(irq_force_affinity);
>
> -int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> + bool setaffinity)
> {
> unsigned long flags;
> struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
> @@ -496,12 +497,11 @@ int irq_set_affinity_hint(unsigned int i
> return -EINVAL;
> desc->affinity_hint = m;
> irq_put_desc_unlock(desc, flags);
> - /* set the initial affinity to prevent every interrupt being on CPU0 */
> - if (m)
> + if (m && setaffinity)
> __irq_set_affinity(irq, m, false);
> return 0;
> }
> -EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
> +EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
>
> static void irq_affinity_notify(struct work_struct *work)
> {
>


--
Nitesh

2021-05-21 21:47:33

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

On Fri, May 21 2021 at 10:45, Lijun Pan wrote:
> On Fri, May 21, 2021 at 7:48 AM Thomas Gleixner <[email protected]> wrote:
>> +/**
>> + * irq_update_affinity_hint - Update the affinity hint
>> + * @irq: Interrupt to update
>> + * @cpumask: cpumask pointer (NULL to clear the hint)
>> + *
>> + * Updates the affinity hint, but does not change the affinity of the interrupt.
>> + */
>> +static inline int
>> +irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
>> +{
>> + return __irq_apply_affinity_hint(irq, m, true);
>> +}
>
> Should it be:
> return __irq_apply_affinity_hint(irq, m, false);
> here?

Of course. Copy & Pasta should be forbidden.

Thanks for spotting it!

tglx

2021-05-21 21:50:22

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

On Fri, May 21 2021 at 12:13, Nitesh Lal wrote:
> On Fri, May 21, 2021 at 8:03 AM Thomas Gleixner <[email protected]> wrote:
>> Provide new interfaces:
>>
>> irq_update_affinity_hint() - Only sets the affinity hint pointer
>> irq_apply_affinity_hint() - Set the pointer and apply the affinity to
>> the interrupt
>>
>
> Any reason why you ruled out the usage of irq_set_affinity_and_hint()?
> IMHO the latter makes it very clear what the function is meant to do.

You're right. I was trying to phase the existing hint setter out, but
that's probably pointless overengineering for no real value. Let me redo
that.

Thanks,

tglx

2021-05-27 10:23:32

by Shung-Hsi Yu

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

On Thu, May 27, 2021 at 06:03:29PM +0800, Shung-Hsi Yu wrote:
> Hi,
>
> On Fri, May 21, 2021 at 02:03:06PM +0200, Thomas Gleixner wrote:
> > The discussion about removing the side effect of irq_set_affinity_hint() of
> > actually applying the cpumask (if not NULL) as affinity to the interrupt,
> > unearthed a few unpleasantries:
> >
> > 1) The modular perf drivers rely on the current behaviour for the very
> > wrong reasons.
> >
> > 2) While none of the other drivers prevents user space from changing
> > the affinity, a cursorily inspection shows that there are at least
> > expectations in some drivers.
> >
> > #1 needs to be cleaned up anyway, so that's not a problem
> >
> > #2 might result in subtle regressions especially when irqbalanced (which
> > nowadays ignores the affinity hint) is disabled.
> >
> > Provide new interfaces:
> >
> > irq_update_affinity_hint() - Only sets the affinity hint pointer
> > irq_apply_affinity_hint() - Set the pointer and apply the affinity to
> > the interrupt
> >
> > Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> > document it to be phased out.
>
> Is there recommended way to retrieve the CPU number that the interrupt has
> affinity?
>
> Previously a driver (I'm looking at drivers/net/ethernet/amazon/ena) that
> uses irq_set_affinity_hint() to spread out IRQ knows the corresponding CPU
> number since they're using their own spreading scheme. Now, phasing out
> irq_set_affinity_hint(), and thus relying on request_irq() to spread the
> load instead, there don't seem to be a easy way to get the CPU number.

I should add that the main use-case for retrieving CPU number seems to be
ensuring memory is allocated on the same NUMA node that serves the interrupt
(again, looking at ena driver only, haven't check others yet).

int cpu = i % num_online_cpu();
cpumask_set_cpu(cpu, &affinity_hint_mask);
request_irq(irq, ...);
irq_set_affinity_hint(irq, &affinity_hint_mask);
int node = cpu_to_node(cpu);
buffer = vzalloc(node);

> In theory the following could work, but including irq.h does not look like a
> good idea given that the comment in its explicitly ask not to be included in
> generic code.
>
> #include <linux/irq.h>
> int irq = request_irq(...);
> struct irq_data *data = irq_get_irq_data(irq);
> struct cpumask *mask = irq_data_get_effective_affinity_mask(data);
> int cpu = cpumask_first(mask);
>
> Any suggestions?
>
>
> Thanks,
> Shung-Hsi
>
> > Signed-off-by: Thomas Gleixner <[email protected]>
> > Link: https://lore.kernel.org/r/[email protected]
> > ---
> > Applies on:
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
> > ---
> > include/linux/interrupt.h | 41 ++++++++++++++++++++++++++++++++++++++++-
> > kernel/irq/manage.c | 8 ++++----
> > 2 files changed, 44 insertions(+), 5 deletions(-)
> >
> > --- a/include/linux/interrupt.h
> > +++ b/include/linux/interrupt.h
> > @@ -328,7 +328,46 @@ extern int irq_force_affinity(unsigned i
> > extern int irq_can_set_affinity(unsigned int irq);
> > extern int irq_select_affinity(unsigned int irq);
> >
> > -extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
> > +extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> > + bool setaffinity);
> > +
> > +/**
> > + * irq_update_affinity_hint - Update the affinity hint
> > + * @irq: Interrupt to update
> > + * @cpumask: cpumask pointer (NULL to clear the hint)
> > + *
> > + * Updates the affinity hint, but does not change the affinity of the interrupt.
> > + */
> > +static inline int
> > +irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
> > +{
> > + return __irq_apply_affinity_hint(irq, m, true);
> > +}
> > +
> > +/**
> > + * irq_apply_affinity_hint - Update the affinity hint and apply the provided
> > + * cpumask to the interrupt
> > + * @irq: Interrupt to update
> > + * @cpumask: cpumask pointer (NULL to clear the hint)
> > + *
> > + * Updates the affinity hint and if @cpumask is not NULL it applies it as
> > + * the affinity of that interrupt.
> > + */
> > +static inline int
> > +irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m)
> > +{
> > + return __irq_apply_affinity_hint(irq, m, true);
> > +}
> > +
> > +/*
> > + * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
> > + * instead.
> > + */
> > +static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> > +{
> > + return irq_apply_affinity_hint(irq, cpumask);
> > +}
> > +
> > extern int irq_update_affinity_desc(unsigned int irq,
> > struct irq_affinity_desc *affinity);
> >
> > --- a/kernel/irq/manage.c
> > +++ b/kernel/irq/manage.c
> > @@ -487,7 +487,8 @@ int irq_force_affinity(unsigned int irq,
> > }
> > EXPORT_SYMBOL_GPL(irq_force_affinity);
> >
> > -int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> > +int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> > + bool setaffinity)
> > {
> > unsigned long flags;
> > struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
> > @@ -496,12 +497,11 @@ int irq_set_affinity_hint(unsigned int i
> > return -EINVAL;
> > desc->affinity_hint = m;
> > irq_put_desc_unlock(desc, flags);
> > - /* set the initial affinity to prevent every interrupt being on CPU0 */
> > - if (m)
> > + if (m && setaffinity)
> > __irq_set_affinity(irq, m, false);
> > return 0;
> > }
> > -EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
> > +EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
> >
> > static void irq_affinity_notify(struct work_struct *work)
> > {

2021-05-27 15:44:07

by Shung-Hsi Yu

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

Hi,

On Fri, May 21, 2021 at 02:03:06PM +0200, Thomas Gleixner wrote:
> The discussion about removing the side effect of irq_set_affinity_hint() of
> actually applying the cpumask (if not NULL) as affinity to the interrupt,
> unearthed a few unpleasantries:
>
> 1) The modular perf drivers rely on the current behaviour for the very
> wrong reasons.
>
> 2) While none of the other drivers prevents user space from changing
> the affinity, a cursorily inspection shows that there are at least
> expectations in some drivers.
>
> #1 needs to be cleaned up anyway, so that's not a problem
>
> #2 might result in subtle regressions especially when irqbalanced (which
> nowadays ignores the affinity hint) is disabled.
>
> Provide new interfaces:
>
> irq_update_affinity_hint() - Only sets the affinity hint pointer
> irq_apply_affinity_hint() - Set the pointer and apply the affinity to
> the interrupt
>
> Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> document it to be phased out.

Is there recommended way to retrieve the CPU number that the interrupt has
affinity?

Previously a driver (I'm looking at drivers/net/ethernet/amazon/ena) that
uses irq_set_affinity_hint() to spread out IRQ knows the corresponding CPU
number since they're using their own spreading scheme. Now, phasing out
irq_set_affinity_hint(), and thus relying on request_irq() to spread the
load instead, there don't seem to be a easy way to get the CPU number.

In theory the following could work, but including irq.h does not look like a
good idea given that the comment in its explicitly ask not to be included in
generic code.

#include <linux/irq.h>
int irq = request_irq(...);
struct irq_data *data = irq_get_irq_data(irq);
struct cpumask *mask = irq_data_get_effective_affinity_mask(data);
int cpu = cpumask_first(mask);

Any suggestions?


Thanks,
Shung-Hsi

> Signed-off-by: Thomas Gleixner <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> Applies on:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
> ---
> include/linux/interrupt.h | 41 ++++++++++++++++++++++++++++++++++++++++-
> kernel/irq/manage.c | 8 ++++----
> 2 files changed, 44 insertions(+), 5 deletions(-)
>
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -328,7 +328,46 @@ extern int irq_force_affinity(unsigned i
> extern int irq_can_set_affinity(unsigned int irq);
> extern int irq_select_affinity(unsigned int irq);
>
> -extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
> +extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> + bool setaffinity);
> +
> +/**
> + * irq_update_affinity_hint - Update the affinity hint
> + * @irq: Interrupt to update
> + * @cpumask: cpumask pointer (NULL to clear the hint)
> + *
> + * Updates the affinity hint, but does not change the affinity of the interrupt.
> + */
> +static inline int
> +irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return __irq_apply_affinity_hint(irq, m, true);
> +}
> +
> +/**
> + * irq_apply_affinity_hint - Update the affinity hint and apply the provided
> + * cpumask to the interrupt
> + * @irq: Interrupt to update
> + * @cpumask: cpumask pointer (NULL to clear the hint)
> + *
> + * Updates the affinity hint and if @cpumask is not NULL it applies it as
> + * the affinity of that interrupt.
> + */
> +static inline int
> +irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return __irq_apply_affinity_hint(irq, m, true);
> +}
> +
> +/*
> + * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
> + * instead.
> + */
> +static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return irq_apply_affinity_hint(irq, cpumask);
> +}
> +
> extern int irq_update_affinity_desc(unsigned int irq,
> struct irq_affinity_desc *affinity);
>
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -487,7 +487,8 @@ int irq_force_affinity(unsigned int irq,
> }
> EXPORT_SYMBOL_GPL(irq_force_affinity);
>
> -int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> + bool setaffinity)
> {
> unsigned long flags;
> struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
> @@ -496,12 +497,11 @@ int irq_set_affinity_hint(unsigned int i
> return -EINVAL;
> desc->affinity_hint = m;
> irq_put_desc_unlock(desc, flags);
> - /* set the initial affinity to prevent every interrupt being on CPU0 */
> - if (m)
> + if (m && setaffinity)
> __irq_set_affinity(irq, m, false);
> return 0;
> }
> -EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
> +EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
>
> static void irq_affinity_notify(struct work_struct *work)
> {

2021-05-27 16:15:50

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

On Thu, May 27, 2021 at 6:03 AM Shung-Hsi Yu <[email protected]> wrote:
>
> Hi,
>
> On Fri, May 21, 2021 at 02:03:06PM +0200, Thomas Gleixner wrote:
> > The discussion about removing the side effect of irq_set_affinity_hint() of
> > actually applying the cpumask (if not NULL) as affinity to the interrupt,
> > unearthed a few unpleasantries:
> >
> > 1) The modular perf drivers rely on the current behaviour for the very
> > wrong reasons.
> >
> > 2) While none of the other drivers prevents user space from changing
> > the affinity, a cursorily inspection shows that there are at least
> > expectations in some drivers.
> >
> > #1 needs to be cleaned up anyway, so that's not a problem
> >
> > #2 might result in subtle regressions especially when irqbalanced (which
> > nowadays ignores the affinity hint) is disabled.
> >
> > Provide new interfaces:
> >
> > irq_update_affinity_hint() - Only sets the affinity hint pointer
> > irq_apply_affinity_hint() - Set the pointer and apply the affinity to
> > the interrupt
> >
> > Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> > document it to be phased out.
>
> Is there recommended way to retrieve the CPU number that the interrupt has
> affinity?
>
> Previously a driver (I'm looking at drivers/net/ethernet/amazon/ena) that
> uses irq_set_affinity_hint() to spread out IRQ knows the corresponding CPU
> number since they're using their own spreading scheme. Now, phasing out
> irq_set_affinity_hint(), and thus relying on request_irq() to spread the
> load instead, there don't seem to be a easy way to get the CPU number.
>

For drivers that don't want to rely on request_irq for spreading and want
to force their own affinity mask can use irq_set_affinity() which is an
exported interface now [1] and clearly indicates the purpose of the usage.

As Thomas suggested we are still keeping irq_set_affinity_hint() as a
wrapper until we make appropriate changes in individual drivers that use
this API for different reasons. Please feel free to send out a patch
for this driver once the changes are merged.

[1] https://lkml.org/lkml/2021/5/18/271

--
Thanks
Nitesh

2021-05-28 07:42:14

by Shung-Hsi Yu

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

On Thu, May 27, 2021 at 09:06:04AM -0400, Nitesh Lal wrote:
> On Thu, May 27, 2021 at 6:03 AM Shung-Hsi Yu <[email protected]> wrote:
> >
> > Hi,
> >
> > On Fri, May 21, 2021 at 02:03:06PM +0200, Thomas Gleixner wrote:
> > > The discussion about removing the side effect of irq_set_affinity_hint() of
> > > actually applying the cpumask (if not NULL) as affinity to the interrupt,
> > > unearthed a few unpleasantries:
> > >
> > > 1) The modular perf drivers rely on the current behaviour for the very
> > > wrong reasons.
> > >
> > > 2) While none of the other drivers prevents user space from changing
> > > the affinity, a cursorily inspection shows that there are at least
> > > expectations in some drivers.
> > >
> > > #1 needs to be cleaned up anyway, so that's not a problem
> > >
> > > #2 might result in subtle regressions especially when irqbalanced (which
> > > nowadays ignores the affinity hint) is disabled.
> > >
> > > Provide new interfaces:
> > >
> > > irq_update_affinity_hint() - Only sets the affinity hint pointer
> > > irq_apply_affinity_hint() - Set the pointer and apply the affinity to
> > > the interrupt
> > >
> > > Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> > > document it to be phased out.
> >
> > Is there recommended way to retrieve the CPU number that the interrupt has
> > affinity?
> >
> > Previously a driver (I'm looking at drivers/net/ethernet/amazon/ena) that
> > uses irq_set_affinity_hint() to spread out IRQ knows the corresponding CPU
> > number since they're using their own spreading scheme. Now, phasing out
> > irq_set_affinity_hint(), and thus relying on request_irq() to spread the
> > load instead, there don't seem to be a easy way to get the CPU number.
> >
>
> For drivers that don't want to rely on request_irq for spreading and want
> to force their own affinity mask can use irq_set_affinity()

I *do* want the driver to rely on request_irq() for spreading.

It is retrieving effective affinity after request_irq() call that I can't
seem to figure out.


Thanks,
Shung-Hsi

> which is an exported interface now [1] and clearly indicates the purpose
> of the usage.
>
> As Thomas suggested we are still keeping irq_set_affinity_hint() as a
> wrapper until we make appropriate changes in individual drivers that use
> this API for different reasons. Please feel free to send out a patch
> for this driver once the changes are merged.
>
> [1] https://lkml.org/lkml/2021/5/18/271
>
> --
> Thanks
> Nitesh
>

2021-06-04 20:38:38

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

On Fri, May 21, 2021 at 5:48 PM Thomas Gleixner <[email protected]> wrote:
>
> On Fri, May 21 2021 at 12:13, Nitesh Lal wrote:
> > On Fri, May 21, 2021 at 8:03 AM Thomas Gleixner <[email protected]> wrote:
> >> Provide new interfaces:
> >>
> >> irq_update_affinity_hint() - Only sets the affinity hint pointer
> >> irq_apply_affinity_hint() - Set the pointer and apply the affinity to
> >> the interrupt
> >>
> >
> > Any reason why you ruled out the usage of irq_set_affinity_and_hint()?
> > IMHO the latter makes it very clear what the function is meant to do.
>
> You're right. I was trying to phase the existing hint setter out, but
> that's probably pointless overengineering for no real value. Let me redo
> that.
>

Thomas, are you planning to send a v2 for this soon or did I somehow miss
it?

Since your other patch "genirq: Export affinity setter for modules" is
already in linux-next, I have started looking into the drivers where
we can use that.

On thinking about this whole chunk a little more, I do wonder about the
reason why we are still sticking with the hints.

The two reasons that I could come up with are:
- We are not entirely sure if irqbalance still consumes this or not
- For future use by some other userspace daemon (?)

Does that sound right?

--
Thanks
Nitesh

2021-06-07 17:04:05

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

On Fri, May 21, 2021 at 8:03 AM Thomas Gleixner <[email protected]> wrote:
>
> The discussion about removing the side effect of irq_set_affinity_hint() of
> actually applying the cpumask (if not NULL) as affinity to the interrupt,
> unearthed a few unpleasantries:
>
> 1) The modular perf drivers rely on the current behaviour for the very
> wrong reasons.
>
> 2) While none of the other drivers prevents user space from changing
> the affinity, a cursorily inspection shows that there are at least
> expectations in some drivers.
>
> #1 needs to be cleaned up anyway, so that's not a problem
>
> #2 might result in subtle regressions especially when irqbalanced (which
> nowadays ignores the affinity hint) is disabled.
>
> Provide new interfaces:
>
> irq_update_affinity_hint() - Only sets the affinity hint pointer
> irq_apply_affinity_hint() - Set the pointer and apply the affinity to
> the interrupt
>
> Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> document it to be phased out.
>
> Signed-off-by: Thomas Gleixner <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> Applies on:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
> ---
> include/linux/interrupt.h | 41 ++++++++++++++++++++++++++++++++++++++++-
> kernel/irq/manage.c | 8 ++++----
> 2 files changed, 44 insertions(+), 5 deletions(-)
>
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -328,7 +328,46 @@ extern int irq_force_affinity(unsigned i
> extern int irq_can_set_affinity(unsigned int irq);
> extern int irq_select_affinity(unsigned int irq);
>
> -extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
> +extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> + bool setaffinity);
> +
> +/**
> + * irq_update_affinity_hint - Update the affinity hint
> + * @irq: Interrupt to update
> + * @cpumask: cpumask pointer (NULL to clear the hint)
> + *
> + * Updates the affinity hint, but does not change the affinity of the interrupt.
> + */
> +static inline int
> +irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return __irq_apply_affinity_hint(irq, m, true);
> +}
> +
> +/**
> + * irq_apply_affinity_hint - Update the affinity hint and apply the provided
> + * cpumask to the interrupt
> + * @irq: Interrupt to update
> + * @cpumask: cpumask pointer (NULL to clear the hint)
> + *
> + * Updates the affinity hint and if @cpumask is not NULL it applies it as
> + * the affinity of that interrupt.
> + */
> +static inline int
> +irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return __irq_apply_affinity_hint(irq, m, true);
> +}
> +
> +/*
> + * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
> + * instead.
> + */
> +static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> + return irq_apply_affinity_hint(irq, cpumask);

Another change required here, the above should be 'm' instead of 'cpumask'.

> +}
> +
> extern int irq_update_affinity_desc(unsigned int irq,
> struct irq_affinity_desc *affinity);
>
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -487,7 +487,8 @@ int irq_force_affinity(unsigned int irq,
> }
> EXPORT_SYMBOL_GPL(irq_force_affinity);
>
> -int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> + bool setaffinity)
> {
> unsigned long flags;
> struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
> @@ -496,12 +497,11 @@ int irq_set_affinity_hint(unsigned int i
> return -EINVAL;
> desc->affinity_hint = m;
> irq_put_desc_unlock(desc, flags);
> - /* set the initial affinity to prevent every interrupt being on CPU0 */
> - if (m)
> + if (m && setaffinity)
> __irq_set_affinity(irq, m, false);
> return 0;
> }
> -EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
> +EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
>
> static void irq_affinity_notify(struct work_struct *work)
> {
>


--
Thanks
Nitesh

2021-06-14 16:14:48

by Nitesh Lal

[permalink] [raw]
Subject: Re: [PATCH] genirq: Provide new interfaces for affinity hints

On Mon, Jun 7, 2021 at 1:00 PM Nitesh Lal <[email protected]> wrote:
>
> On Fri, May 21, 2021 at 8:03 AM Thomas Gleixner <[email protected]> wrote:
> >
> > The discussion about removing the side effect of irq_set_affinity_hint() of
> > actually applying the cpumask (if not NULL) as affinity to the interrupt,
> > unearthed a few unpleasantries:
> >
> > 1) The modular perf drivers rely on the current behaviour for the very
> > wrong reasons.
> >
> > 2) While none of the other drivers prevents user space from changing
> > the affinity, a cursorily inspection shows that there are at least
> > expectations in some drivers.
> >
> > #1 needs to be cleaned up anyway, so that's not a problem
> >
> > #2 might result in subtle regressions especially when irqbalanced (which
> > nowadays ignores the affinity hint) is disabled.
> >
> > Provide new interfaces:
> >
> > irq_update_affinity_hint() - Only sets the affinity hint pointer
> > irq_apply_affinity_hint() - Set the pointer and apply the affinity to
> > the interrupt
> >
> > Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> > document it to be phased out.
> >
> > Signed-off-by: Thomas Gleixner <[email protected]>
> > Link: https://lore.kernel.org/r/[email protected]
> > ---
> > Applies on:
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
> > ---
> > include/linux/interrupt.h | 41 ++++++++++++++++++++++++++++++++++++++++-
> > kernel/irq/manage.c | 8 ++++----
> > 2 files changed, 44 insertions(+), 5 deletions(-)
> >
> > --- a/include/linux/interrupt.h
> > +++ b/include/linux/interrupt.h
> > @@ -328,7 +328,46 @@ extern int irq_force_affinity(unsigned i
> > extern int irq_can_set_affinity(unsigned int irq);
> > extern int irq_select_affinity(unsigned int irq);
> >
> > -extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
> > +extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> > + bool setaffinity);
> > +
> > +/**
> > + * irq_update_affinity_hint - Update the affinity hint
> > + * @irq: Interrupt to update
> > + * @cpumask: cpumask pointer (NULL to clear the hint)
> > + *
> > + * Updates the affinity hint, but does not change the affinity of the interrupt.
> > + */
> > +static inline int
> > +irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
> > +{
> > + return __irq_apply_affinity_hint(irq, m, true);
> > +}
> > +
> > +/**
> > + * irq_apply_affinity_hint - Update the affinity hint and apply the provided
> > + * cpumask to the interrupt
> > + * @irq: Interrupt to update
> > + * @cpumask: cpumask pointer (NULL to clear the hint)
> > + *
> > + * Updates the affinity hint and if @cpumask is not NULL it applies it as
> > + * the affinity of that interrupt.
> > + */
> > +static inline int
> > +irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m)
> > +{
> > + return __irq_apply_affinity_hint(irq, m, true);
> > +}
> > +
> > +/*
> > + * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
> > + * instead.
> > + */
> > +static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> > +{
> > + return irq_apply_affinity_hint(irq, cpumask);
>
> Another change required here, the above should be 'm' instead of 'cpumask'.

I am going to and make the suggested changes to this patch and will post it
with driver patches.
Please let me know if there are any objections to that.

>
> > +}
> > +
> > extern int irq_update_affinity_desc(unsigned int irq,
> > struct irq_affinity_desc *affinity);
> >
> > --- a/kernel/irq/manage.c
> > +++ b/kernel/irq/manage.c
> > @@ -487,7 +487,8 @@ int irq_force_affinity(unsigned int irq,
> > }
> > EXPORT_SYMBOL_GPL(irq_force_affinity);
> >
> > -int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> > +int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> > + bool setaffinity)
> > {
> > unsigned long flags;
> > struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
> > @@ -496,12 +497,11 @@ int irq_set_affinity_hint(unsigned int i
> > return -EINVAL;
> > desc->affinity_hint = m;
> > irq_put_desc_unlock(desc, flags);
> > - /* set the initial affinity to prevent every interrupt being on CPU0 */
> > - if (m)
> > + if (m && setaffinity)
> > __irq_set_affinity(irq, m, false);
> > return 0;
> > }
> > -EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
> > +EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
> >
> > static void irq_affinity_notify(struct work_struct *work)
> > {
> >
>
>
> --
> Thanks
> Nitesh



--
Thanks
Nitesh

Subject: [tip: irq/core] genirq: Provide new interfaces for affinity hints

The following commit has been merged into the irq/core branch of tip:

Commit-ID: 65c7cdedeb3026fabcc967a7aae2f755ad4d0783
Gitweb: https://git.kernel.org/tip/65c7cdedeb3026fabcc967a7aae2f755ad4d0783
Author: Thomas Gleixner <[email protected]>
AuthorDate: Fri, 03 Sep 2021 11:24:17 -04:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Fri, 10 Dec 2021 20:47:38 +01:00

genirq: Provide new interfaces for affinity hints

The discussion about removing the side effect of irq_set_affinity_hint() of
actually applying the cpumask (if not NULL) as affinity to the interrupt,
unearthed a few unpleasantries:

1) The modular perf drivers rely on the current behaviour for the very
wrong reasons.

2) While none of the other drivers prevents user space from changing
the affinity, a cursorily inspection shows that there are at least
expectations in some drivers.

#1 needs to be cleaned up anyway, so that's not a problem

#2 might result in subtle regressions especially when irqbalanced (which
nowadays ignores the affinity hint) is disabled.

Provide new interfaces:

irq_update_affinity_hint() - Only sets the affinity hint pointer
irq_set_affinity_and_hint() - Set the pointer and apply the affinity to
the interrupt

Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
document it to be phased out.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Nitesh Narayan Lal <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]

---
include/linux/interrupt.h | 53 +++++++++++++++++++++++++++++++++++++-
kernel/irq/manage.c | 8 +++---
2 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 1f22a30..9367f1c 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -329,7 +329,46 @@ extern int irq_force_affinity(unsigned int irq, const struct cpumask *cpumask);
extern int irq_can_set_affinity(unsigned int irq);
extern int irq_select_affinity(unsigned int irq);

-extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
+extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
+ bool setaffinity);
+
+/**
+ * irq_update_affinity_hint - Update the affinity hint
+ * @irq: Interrupt to update
+ * @m: cpumask pointer (NULL to clear the hint)
+ *
+ * Updates the affinity hint, but does not change the affinity of the interrupt.
+ */
+static inline int
+irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
+{
+ return __irq_apply_affinity_hint(irq, m, false);
+}
+
+/**
+ * irq_set_affinity_and_hint - Update the affinity hint and apply the provided
+ * cpumask to the interrupt
+ * @irq: Interrupt to update
+ * @m: cpumask pointer (NULL to clear the hint)
+ *
+ * Updates the affinity hint and if @m is not NULL it applies it as the
+ * affinity of that interrupt.
+ */
+static inline int
+irq_set_affinity_and_hint(unsigned int irq, const struct cpumask *m)
+{
+ return __irq_apply_affinity_hint(irq, m, true);
+}
+
+/*
+ * Deprecated. Use irq_update_affinity_hint() or irq_set_affinity_and_hint()
+ * instead.
+ */
+static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
+{
+ return irq_set_affinity_and_hint(irq, m);
+}
+
extern int irq_update_affinity_desc(unsigned int irq,
struct irq_affinity_desc *affinity);

@@ -361,6 +400,18 @@ static inline int irq_can_set_affinity(unsigned int irq)

static inline int irq_select_affinity(unsigned int irq) { return 0; }

+static inline int irq_update_affinity_hint(unsigned int irq,
+ const struct cpumask *m)
+{
+ return -EINVAL;
+}
+
+static inline int irq_set_affinity_and_hint(unsigned int irq,
+ const struct cpumask *m)
+{
+ return -EINVAL;
+}
+
static inline int irq_set_affinity_hint(unsigned int irq,
const struct cpumask *m)
{
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 7405e38..f23ffd3 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -486,7 +486,8 @@ int irq_force_affinity(unsigned int irq, const struct cpumask *cpumask)
}
EXPORT_SYMBOL_GPL(irq_force_affinity);

-int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
+int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
+ bool setaffinity)
{
unsigned long flags;
struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
@@ -495,12 +496,11 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
return -EINVAL;
desc->affinity_hint = m;
irq_put_desc_unlock(desc, flags);
- /* set the initial affinity to prevent every interrupt being on CPU0 */
- if (m)
+ if (m && setaffinity)
__irq_set_affinity(irq, m, false);
return 0;
}
-EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
+EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);

static void irq_affinity_notify(struct work_struct *work)
{