2015-04-08 04:52:58

by Seiichi Ikarashi

Subject: [PATCH] irq: Remove unnecessary warning with affinity_hint

Hi,

If you turn off a PCI device whose driver has set affinity_hint,
you will get a warning message that, from the user's point of view,
does _not_ explain why it appeared.

# echo 0 > /sys/bus/pci/slots/65/power

Apr 28 20:29:39 localhost kernel: ------------[ cut here ]------------
Apr 28 20:29:39 localhost kernel: WARNING: at kernel/irq/manage.c:1002 __free_irq+0x22d/0x250() (Tainted: P --------------- )
(snip)

Users will mistakenly conclude that some problem has occurred
even though they succeeded in turning off the device.
I suppose this warning was originally meant as a debugging aid
for driver developers and was left in by accident.

Simply removing the warning is sufficient.

Signed-off-by: Seiichi Ikarashi <[email protected]>

--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1335,7 +1335,7 @@ static struct irqaction *__free_irq(unsi

 #ifdef CONFIG_SMP
 	/* make sure affinity_hint is cleaned up */
-	if (WARN_ON_ONCE(desc->affinity_hint))
+	if (desc->affinity_hint)
 		desc->affinity_hint = NULL;
 #endif



2015-04-08 06:28:55

by Ingo Molnar

Subject: Re: [PATCH] irq: Remove unnecessary warning with affinity_hint


* Seiichi Ikarashi <[email protected]> wrote:

> Hi,
>
> If you turn off a PCI device whose driver has set affinity_hint,
> you will get warning message which does _not_ explain the reason
> why it appeared from the user's point of view.
>
> # echo 0 > /sys/bus/pci/slots/65/power
>
> Apr 28 20:29:39 localhost kernel: ------------[ cut here ]------------
> Apr 28 20:29:39 localhost kernel: WARNING: at kernel/irq/manage.c:1002 __free_irq+0x22d/0x250() (Tainted: P --------------- )
> (snip)
>
> Users will misunderstand some problem has happened
> even though he or she succeeded to turn off the device.
> I suppose this warning was originally for a debug purpose
> for driver developers and has incidentally been left.
>
> Just remove the warning is good and enough.
>
> Signed-off-by: Seiichi Ikarashi <[email protected]>
>
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1335,7 +1335,7 @@ static struct irqaction *__free_irq(unsi
>
> #ifdef CONFIG_SMP
> /* make sure affinity_hint is cleaned up */
> - if (WARN_ON_ONCE(desc->affinity_hint))
> + if (desc->affinity_hint)
> desc->affinity_hint = NULL;

Well, drivers that are using irq_set_affinity_hint() are expected to
call:

irq_set_affinity_hint(irq, NULL);

to clear the affinity mask, before releasing the irq. This warning
flags drivers that forgot to do that and which might thus leak a
dynamically allocated CPU mask (and/or other resources).
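
The teardown order described above can be sketched as a small userspace model. This is not kernel code: every mock_* name below is hypothetical, and the model only assumes that the hint is a stored pointer and that the check in __free_irq() fires when that pointer is still set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct irq_desc; only the hint is modelled. */
struct mock_irq_desc {
	const unsigned long *affinity_hint;	/* NULL when no hint is set */
	bool requested;				/* true between request and free */
};

/* Models irq_set_affinity_hint(irq, mask): passing NULL clears the hint. */
static void mock_set_affinity_hint(struct mock_irq_desc *desc,
				   const unsigned long *mask)
{
	assert(desc != NULL);
	desc->affinity_hint = mask;
}

/*
 * Models the check in __free_irq(): returns true if the warning would
 * fire because a hint was left behind. The hint is cleared either way.
 */
static bool mock_free_irq(struct mock_irq_desc *desc)
{
	bool warned;

	assert(desc != NULL);
	warned = desc->affinity_hint != NULL;
	desc->affinity_hint = NULL;
	desc->requested = false;
	return warned;
}

/* Expected teardown: clear the hint first, then release the IRQ. */
static bool mock_driver_remove_good(struct mock_irq_desc *desc)
{
	mock_set_affinity_hint(desc, NULL);
	return mock_free_irq(desc);
}

/* Buggy teardown: releases the IRQ with the hint still set. */
static bool mock_driver_remove_buggy(struct mock_irq_desc *desc)
{
	return mock_free_irq(desc);
}
```

In this model mock_driver_remove_buggy() reports that the warning would fire, while mock_driver_remove_good() does not.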

Feel free to turn the warning message into a more informative WARN()
that will blame the driver that triggered it, if the stack dump into
the driver wasn't enough of a clue ...
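
One possible shape for such a message, sketched in userspace rather than as a kernel patch. The mock_* types, the function name and the message wording are all assumptions made for illustration; the only detail borrowed from the kernel is that an irqaction carries the name the driver supplied when it requested the IRQ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced stand-ins for struct irqaction and struct irq_desc. */
struct mock_irqaction {
	const char *name;	/* name the driver passed when requesting the IRQ */
};

struct mock_irq_desc {
	const unsigned long *affinity_hint;
};

/*
 * Models an informative variant of the check: if a hint is still set,
 * write a message naming the IRQ and its owner into buf, clear the hint
 * and return 1; otherwise leave buf empty and return 0.
 */
static int mock_warn_stale_hint(struct mock_irq_desc *desc,
				const struct mock_irqaction *action,
				unsigned int irq, char *buf, size_t len)
{
	assert(desc != NULL && action != NULL && buf != NULL);

	if (len > 0)
		buf[0] = '\0';
	if (!desc->affinity_hint)
		return 0;

	snprintf(buf, len,
		 "irq %u (%s): affinity_hint not cleared before free_irq()",
		 irq, action->name);
	desc->affinity_hint = NULL;
	return 1;
}
```

A message of this kind would name the offending driver directly instead of relying on the reader to interpret the stack dump.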

Thanks,

Ingo

2015-04-08 07:33:19

by Seiichi Ikarashi

Subject: Re: [PATCH] irq: Remove unnecessary warning with affinity_hint

Hi,

On 2015-04-08 15:28, Ingo Molnar wrote:
>
> * Seiichi Ikarashi <[email protected]> wrote:
>
>> Hi,
>>
>> If you turn off a PCI device whose driver has set affinity_hint,
>> you will get warning message which does _not_ explain the reason
>> why it appeared from the user's point of view.
>>
>> # echo 0 > /sys/bus/pci/slots/65/power
>>
>> Apr 28 20:29:39 localhost kernel: ------------[ cut here ]------------
>> Apr 28 20:29:39 localhost kernel: WARNING: at kernel/irq/manage.c:1002 __free_irq+0x22d/0x250() (Tainted: P --------------- )
>> (snip)
>>
>> Users will misunderstand some problem has happened
>> even though he or she succeeded to turn off the device.
>> I suppose this warning was originally for a debug purpose
>> for driver developers and has incidentally been left.
>>
>> Just remove the warning is good and enough.
>>
>> Signed-off-by: Seiichi Ikarashi <[email protected]>
>>
>> --- a/kernel/irq/manage.c
>> +++ b/kernel/irq/manage.c
>> @@ -1335,7 +1335,7 @@ static struct irqaction *__free_irq(unsi
>>
>> #ifdef CONFIG_SMP
>> /* make sure affinity_hint is cleaned up */
>> - if (WARN_ON_ONCE(desc->affinity_hint))
>> + if (desc->affinity_hint)
>> desc->affinity_hint = NULL;
>
> Well, drivers that are using irq_set_affinity_hint() are expected to
> call:
>
> irq_set_affinity_hint(irq, NULL);
>
> to clear the affinity mask, before releasing the irq. This warning
> flags drivers that forgot to do that and which might thus leak a
> dynamically allocated CPU mask (and/or other resources).

Calling irq_set_affinity_hint(irq, NULL) does not guarantee that
the driver remembers to free a dynamically allocated CPU mask
and/or other resources. But if calling it with a NULL second argument
before releasing the irq is effectively a rule of the
irq_set_affinity_hint() interface, I understand.
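
To illustrate the distinction: setting or clearing the hint only changes a pointer held by the IRQ core, so a mask the driver allocated itself still has to be freed by the driver. The following is a userspace sketch with hypothetical mock_* names, using calloc()/free() in place of the kernel's allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct irq_desc; only the hint is modelled. */
struct mock_irq_desc {
	const unsigned long *affinity_hint;
};

/* Hypothetical driver state that owns a heap-allocated mask. */
struct mock_driver {
	unsigned long *mask;
	struct mock_irq_desc desc;
};

/* Allocate the mask and publish it as the hint; returns 0 on success. */
static int mock_driver_probe(struct mock_driver *drv)
{
	assert(drv != NULL);
	drv->mask = calloc(1, sizeof(*drv->mask));
	if (!drv->mask)
		return -1;
	*drv->mask = 0x1UL;
	drv->desc.affinity_hint = drv->mask;
	return 0;
}

/* Clearing the hint drops the stored pointer; the mask stays allocated. */
static void mock_clear_hint(struct mock_driver *drv)
{
	assert(drv != NULL);
	drv->desc.affinity_hint = NULL;
}

/* Full teardown: clear the hint, then free what the driver allocated. */
static void mock_driver_remove(struct mock_driver *drv)
{
	mock_clear_hint(drv);
	/* releasing the IRQ would happen here */
	free(drv->mask);
	drv->mask = NULL;
}
```

After mock_clear_hint() alone, the mask pointer is still valid and still owned by the driver, which is why clearing the hint and freeing the mask are separate steps.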

>
> Feel free to turn the warning message into a more informative WARN()
> that will blame the driver that triggered it, if the stack dump into
> the driver wasn't a clue enough ...

Still, I am not sure that keeping the warning message is a
cost-effective way to prevent drivers from potentially leaking
resources. Business users (not developers) hate this kind of
developer-oriented message.

Thanks,
Seiichi

2015-04-08 07:39:43

by Ingo Molnar

Subject: Re: [PATCH] irq: Remove unnecessary warning with affinity_hint


* Seiichi Ikarashi <[email protected]> wrote:

> Hi,
>
> On 2015-04-08 15:28, Ingo Molnar wrote:
> >
> > * Seiichi Ikarashi <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> If you turn off a PCI device whose driver has set affinity_hint,
> >> you will get warning message which does _not_ explain the reason
> >> why it appeared from the user's point of view.
> >>
> >> # echo 0 > /sys/bus/pci/slots/65/power
> >>
> >> Apr 28 20:29:39 localhost kernel: ------------[ cut here ]------------
> >> Apr 28 20:29:39 localhost kernel: WARNING: at kernel/irq/manage.c:1002 __free_irq+0x22d/0x250() (Tainted: P --------------- )
> >> (snip)
> >>
> >> Users will misunderstand some problem has happened
> >> even though he or she succeeded to turn off the device.
> >> I suppose this warning was originally for a debug purpose
> >> for driver developers and has incidentally been left.
> >>
> >> Just remove the warning is good and enough.
> >>
> >> Signed-off-by: Seiichi Ikarashi <[email protected]>
> >>
> >> --- a/kernel/irq/manage.c
> >> +++ b/kernel/irq/manage.c
> >> @@ -1335,7 +1335,7 @@ static struct irqaction *__free_irq(unsi
> >>
> >> #ifdef CONFIG_SMP
> >> /* make sure affinity_hint is cleaned up */
> >> - if (WARN_ON_ONCE(desc->affinity_hint))
> >> + if (desc->affinity_hint)
> >> desc->affinity_hint = NULL;
> >
> > Well, drivers that are using irq_set_affinity_hint() are expected to
> > call:
> >
> > irq_set_affinity_hint(irq, NULL);
> >
> > to clear the affinity mask, before releasing the irq. This warning
> > flags drivers that forgot to do that and which might thus leak a
> > dynamically allocated CPU mask (and/or other resources).
>
> Calling irq_set_affinity_hint(irq, NULL) does not guarantee that the
> driver does not forget to deallocate a dynamically allocated CPU
> mask and/or other resources. [...]

I said 'might leak', not 'guaranteed to leak'.

Calling irq_set_affinity_hint(irq, NULL) is the way this kernel API is
specified to be used. Forgetting to do it is a kernel driver bug and
triggers a warning message from the kernel's IRQ subsystem.

> [...] But if calling it with NULL 2nd-arg before releasing the irq
> is a virtual rule of using irq_set_affinity_hint() interface, I
> understand it.
>
> > Feel free to turn the warning message into a more informative
> > WARN() that will blame the driver that triggered it, if the stack
> > dump into the driver wasn't a clue enough ...
>
> Still, I do not know leaving the warning message is effective to
> prevent drivers from potentially leaking resource... considering a
> kind of cost-effectiveness. Business users (not developers) hate such
> kind of messages for developers.

It's a warning message pointing out a kernel bug: that
irq_set_affinity_hint(irq, NULL) was not called properly.

Kernel bugs pointed out by such messages should be fixed, not hidden.

Thanks,

Ingo

2015-04-08 08:06:47

by Seiichi Ikarashi

Subject: Re: [PATCH] irq: Remove unnecessary warning with affinity_hint

On 2015-04-08 16:39, Ingo Molnar wrote:
>
> * Seiichi Ikarashi <[email protected]> wrote:
>
>> Hi,
>>
>> On 2015-04-08 15:28, Ingo Molnar wrote:
>>>
>>> * Seiichi Ikarashi <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> If you turn off a PCI device whose driver has set affinity_hint,
>>>> you will get warning message which does _not_ explain the reason
>>>> why it appeared from the user's point of view.
>>>>
>>>> # echo 0 > /sys/bus/pci/slots/65/power
>>>>
>>>> Apr 28 20:29:39 localhost kernel: ------------[ cut here ]------------
>>>> Apr 28 20:29:39 localhost kernel: WARNING: at kernel/irq/manage.c:1002 __free_irq+0x22d/0x250() (Tainted: P --------------- )
>>>> (snip)
>>>>
>>>> Users will misunderstand some problem has happened
>>>> even though he or she succeeded to turn off the device.
>>>> I suppose this warning was originally for a debug purpose
>>>> for driver developers and has incidentally been left.
>>>>
>>>> Just remove the warning is good and enough.
>>>>
>>>> Signed-off-by: Seiichi Ikarashi <[email protected]>
>>>>
>>>> --- a/kernel/irq/manage.c
>>>> +++ b/kernel/irq/manage.c
>>>> @@ -1335,7 +1335,7 @@ static struct irqaction *__free_irq(unsi
>>>>
>>>> #ifdef CONFIG_SMP
>>>> /* make sure affinity_hint is cleaned up */
>>>> - if (WARN_ON_ONCE(desc->affinity_hint))
>>>> + if (desc->affinity_hint)
>>>> desc->affinity_hint = NULL;
>>>
>>> Well, drivers that are using irq_set_affinity_hint() are expected to
>>> call:
>>>
>>> irq_set_affinity_hint(irq, NULL);
>>>
>>> to clear the affinity mask, before releasing the irq. This warning
>>> flags drivers that forgot to do that and which might thus leak a
>>> dynamically allocated CPU mask (and/or other resources).
>>
>> Calling irq_set_affinity_hint(irq, NULL) does not guarantee that the
>> driver does not forget to deallocate a dynamically allocated CPU
>> mask and/or other resources. [...]
>
> I said 'might leak', not 'guaranteed to leak'.

Yes, I know.
I wrote it because I was not sure about the primary purpose of
showing the warning message.

>
> Calling irq_set_affinity_hint(irq, NULL) is the way this kernel API is
> specified to be used. Forgetting to do it is a kernel driver bug and
> triggers a warning message from the kernel's IRQ subsystem.
>
>> [...] But if calling it with NULL 2nd-arg before releasing the irq
>> is a virtual rule of using irq_set_affinity_hint() interface, I
>> understand it.
>>
>>> Feel free to turn the warning message into a more informative
>>> WARN() that will blame the driver that triggered it, if the stack
>>> dump into the driver wasn't a clue enough ...
>>
>> Still, I do not know leaving the warning message is effective to
>> prevent drivers from potentially leaking resource... considering a
>> kind of cost-effectiveness. Business users (not developers) hate such
>> kind of messages for developers.
>
> it's a warning message pointing out a kernel bug: that
> irq_set_affinity_hint(irq, NULL) was not called properly.
>
> Messages pointing out kernel bugs should be fixed, not hidden.

OK, so the conclusion is that this is a kernel driver bug in how
irq_set_affinity_hint() is used.

Thanks, Ingo.

Seiichi