2021-04-09 15:24:57

by Larry Finger

[permalink] [raw]
Subject: Memory leak in ieee80211_rx_napi()

Johannes and all mac80211 gurus,

What setting or lack of, would cause ieee80211_rx_napi() to leak the skb that it
is given? The documentation states that once this call is made, mac80211 owns
this buffer. Does this mean that it will also be freed?

Larry


2021-04-09 19:31:53

by Johannes Berg

[permalink] [raw]
Subject: Re: Memory leak in ieee80211_rx_napi()

Hi Larry,

> What setting or lack of, would cause ieee80211_rx_napi() to leak the skb that it
> is given? The documentation states that once this call is made, mac80211 owns
> this buffer. Does this mean that it will also be freed?

Eventually, yes. But it might go onto a NAPI GRO list, etc. Perhaps it
might even look like it's leaked if it's on such a list if you didn't
implement NAPI properly as polling, but just call ieee80211_rx_napi()
with a non-NULL napi struct pointer.

That said, of course there might be bugs in mac80211 where it actually
leaks the skb.

How are you determining that it's being leaked?

johannes

2021-04-09 21:29:17

by Larry Finger

[permalink] [raw]
Subject: Re: Memory leak in ieee80211_rx_napi()

On 4/9/21 2:31 PM, Johannes Berg wrote:
> Hi Larry,
>
>> What setting or lack of, would cause ieee80211_rx_napi() to leak the skb that it
>> is given? The documentation states that once this call is made, mac80211 owns
>> this buffer. Does this mean that it will also be freed?
>
> Eventually, yes. But it might go onto a NAPI GRO list, etc. Perhaps it
> might even look like it's leaked if it's on such a list if you didn't
> implement NAPI properly as polling, but just call ieee80211_rx_napi()
> with a non-NULL napi struct pointer.
>
> That said, of course there might be bugs in mac80211 where it actually
> leaks the skb.
>
> How are you determining that it's being leaked?
>
> johannes
>

I use kmemleak. They are real leaks as they persist after the rtw drivers are
unloaded.

The call is ieee80211_rx_napi(rtwdev->hw, NULL, new, napi); I added a test for
napi == NULL. None failed.

Is it possible that the NULL for struct ieee80211_sta would screw it up? I see
that most of the Intel drivers use a non-NULL argument.

Would a "Network controller [0280]: Intel Corporation Wireless 7260 [8086:08b1]
(rev 73)" use napi? If so, iwlmvm does not leak anything. I do not have any
other devices that use napi.

Larry



2021-04-11 21:14:14

by Larry Finger

[permalink] [raw]
Subject: Re: Memory leak in ieee80211_rx_napi()

On 4/9/21 2:31 PM, Johannes Berg wrote:
> Hi Larry,
>
>> What setting or lack of, would cause ieee80211_rx_napi() to leak the skb that it
>> is given? The documentation states that once this call is made, mac80211 owns
>> this buffer. Does this mean that it will also be freed?
>
> Eventually, yes. But it might go onto a NAPI GRO list, etc. Perhaps it
> might even look like it's leaked if it's on such a list if you didn't
> implement NAPI properly as polling, but just call ieee80211_rx_napi()
> with a non-NULL napi struct pointer.

Hi Johannes,

There were two bugs in rtw88. The first, suggested by PK, was that the sequence
between start/stop of NAPI and the enable/disable of interrupts were reversed.
The second bug was in NAPI polling as you suggested. The poll routine was
calling napi_schedule() rather than napi_reschedule().

With these two changes, my RTL8822CE handled 12 hours of flood ping with my
router without leaking a single buffer.

Thanks for your help,

Larry

2021-04-12 11:55:52

by Johannes Berg

[permalink] [raw]
Subject: Re: Memory leak in ieee80211_rx_napi()

Hi Larry,

Sorry I didn't respond to your other email on Friday - it was close to
midnight here and I couldn't pay much attention over the weekend.

> There were two bugs in rtw88. The first, suggested by PK, was that the sequence
> between start/stop of NAPI and the enable/disable of interrupts were reversed.
> The second bug was in NAPI polling as you suggested. The poll routine was
> calling napi_schedule() rather than napi_reschedule().
>
> With these two changes, my RTL8822CE handled 12 hours of flood ping with my
> router without leaking a single buffer.

Glad you found the issues, and thanks for following up!

I do wonder where the SKBs were actually leaked, that's a bit strange,
but I guess it doesn't matter much now.

johannes