2020-07-23 10:59:39

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [Softirq] a76eadba0d: WARNING:at_net/mac80211/rx.c:#ieee80211_rx_napi[mac80211]

kernel test robot <[email protected]> writes:
> [ 106.856151] WARNING: CPU: 5 PID: 4569 at net/mac80211/rx.c:4708 ieee80211_rx_napi+0x44d/0x560 [mac80211]

Bah. I clearly should have noticed when looking at the patch.

pending = softirq_pending();

set_softirq_pending(0);

while (pending) {
....

if (timeout)
break;
}

That drops everything which has not yet been processed and the above
warning is due to this.

Thanks,

tglx


2020-07-23 13:58:39

by jun qian

[permalink] [raw]
Subject: Re: [Softirq] a76eadba0d: WARNING:at_net/mac80211/rx.c:#ieee80211_rx_napi[mac80211]

On Thu, Jul 23, 2020 at 6:58 PM Thomas Gleixner <[email protected]> wrote:
>
> kernel test robot <[email protected]> writes:
> > [ 106.856151] WARNING: CPU: 5 PID: 4569 at net/mac80211/rx.c:4708 ieee80211_rx_napi+0x44d/0x560 [mac80211]
>
> Bah. I clearly should have noticed when looking at the patch.
>
> pending = softirq_pending();
>
> set_softirq_pending(0);
>
> while (pending) {
> ....
>
> if (timeout)
> break;
> }
>
> That drops everything which has not yet been processed and the above
> warning is due to this.
>
wow, I made a mistake, thank you for finding the cause of the problem
so quickly.

How about the following code. we need to clear the corresponding
pending bit at the
right time Instead of all the pending bits cleared in the start.

pending = softirq_pending();

while ((softirq_bit = ffs(pending))) {

pending >>= softirq_bit;

set_softirq_pending(pending); //Only clear the corresponding
bit which will be processed.

h->action(h);

if (timeout)
break;
}

> Thanks,
>
> tglx
>

2020-07-23 14:36:39

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [Softirq] a76eadba0d: WARNING:at_net/mac80211/rx.c:#ieee80211_rx_napi[mac80211]

jun qian <[email protected]> writes:
> On Thu, Jul 23, 2020 at 6:58 PM Thomas Gleixner <[email protected]> wrote:
>> That drops everything which has not yet been processed and the above
>> warning is due to this.
>>
> wow, I made a mistake, thank you for finding the cause of the problem
> so quickly.
>
> How about the following code. we need to clear the corresponding
> pending bit at the
> right time Instead of all the pending bits cleared in the start.
>
> pending = softirq_pending();
>
> while ((softirq_bit = ffs(pending))) {
>
> pending >>= softirq_bit;
>
> set_softirq_pending(pending); //Only clear the corresponding
> bit which will be processed.

How is that supposed to be correct. pending has been shifted
right. Something like this should work:

h++;
pending >>= softirq_bit;

if (timeout()) {
/*
* Ensure that the remaining pending bits
* are handled.
*/
or_softirq_pending(pending << (vec_nr + 1));
break;
}
}

Thanks,

tglx

2020-07-24 05:36:42

by jun qian

[permalink] [raw]
Subject: Re: [Softirq] a76eadba0d: WARNING:at_net/mac80211/rx.c:#ieee80211_rx_napi[mac80211]

On Thu, Jul 23, 2020 at 10:35 PM Thomas Gleixner <[email protected]> wrote:
>
> jun qian <[email protected]> writes:
> > On Thu, Jul 23, 2020 at 6:58 PM Thomas Gleixner <[email protected]> wrote:
> >> That drops everything which has not yet been processed and the above
> >> warning is due to this.
> >>
> > wow, I made a mistake, thank you for finding the cause of the problem
> > so quickly.
> >
> > How about the following code. we need to clear the corresponding
> > pending bit at the
> > right time Instead of all the pending bits cleared in the start.
> >
> > pending = softirq_pending();
> >
> > while ((softirq_bit = ffs(pending))) {
> >
> > pending >>= softirq_bit;
> >
> > set_softirq_pending(pending); //Only clear the corresponding
> > bit which will be processed.
>
> How is that supposed to be correct. pending has been shifted
> right. Something like this should work:
>
> h++;
> pending >>= softirq_bit;
>
> if (timeout()) {
> /*
> * Ensure that the remaining pending bits
> * are handled.
> */
> or_softirq_pending(pending << (vec_nr + 1));
> break;
> }
> }
>
> Thanks,
>
> tglx
>

I have two questions that need to be discussed.

1. If the __do_sofrirq() is executed in the ksoftirqd, we may not need
to check the timeout in the loop.
2. Both the invoke_softirq() and run_ksoftirqd() will execute
__do_sofirq, they all execute the same code,
when it is in the ksoftirqd, Do we need to wake up ksoftirqd in
the process context according to
max_restart and MAX_SOFTIRQ_TIME. In my opinion, If we use a flag
to distinguish where
__do_softirq() is called from, we can do what is most suitable
for __do_softirq based on this flag.

2020-07-24 23:35:15

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [Softirq] a76eadba0d: WARNING:at_net/mac80211/rx.c:#ieee80211_rx_napi[mac80211]

jun qian <[email protected]> writes:
> I have two questions that need to be discussed.
>
> 1. If the __do_sofrirq() is executed in the ksoftirqd, we may not need
> to check the timeout in the loop.
> 2. Both the invoke_softirq() and run_ksoftirqd() will execute
> __do_sofirq, they all execute the same code,
> when it is in the ksoftirqd, Do we need to wake up ksoftirqd in
> the process context according to
> max_restart and MAX_SOFTIRQ_TIME. In my opinion, If we use a flag
> to distinguish where
> __do_softirq() is called from, we can do what is most suitable
> for __do_softirq based on this flag.

You answered your questions yourself :)