2014-05-07 12:06:48

by James Cameron

[permalink] [raw]
Subject: mwifiex: no wr_port available

G'day,

Is there a short duration leak of wr_port in
mwifiex_sdio_host_to_card()?

If mwifiex_host_to_card_mp_aggr fails because of is_suspended in
mwifiex_write_data_sync, is the wr_port ever released?

Or maybe mp_wr_bitmap is re-read from the card when a download
interrupt occurs, or if an upload interrupt occurs after a command is
sent.

--
James Cameron
http://quozl.linux.org.au/


2014-05-14 02:53:06

by Bing Zhao

[permalink] [raw]
Subject: RE: mwifiex: no wr_port available

Hi James,

> G'day,
>
> Is there a short duration leak of wr_port in
> mwifiex_sdio_host_to_card()?

You are right. There is a leak in the error case. Thanks for pointing it out.

>
> If mwifiex_host_to_card_mp_aggr fails because of is_suspended in
> mwifiex_write_data_sync, is the wr_port ever released?
>
> Or maybe mp_wr_bitmap is re-read from the card when a download
> interrupt occurs, or if an upload interrupt occurs after a command is
> sent.

A patch has been sent to address this issue.

Thanks,
Bing


2014-05-14 03:18:16

by James Cameron

[permalink] [raw]
Subject: Re: mwifiex: no wr_port available

On Tue, May 13, 2014 at 07:52:47PM -0700, Bing Zhao wrote:
> Hi James,
>
> > G'day,
> >
> > Is there a short duration leak of wr_port in
> > mwifiex_sdio_host_to_card()?
>
> You are right. There is a leak in the error case. Thanks for
> pointing it out.

Thanks.

> > If mwifiex_host_to_card_mp_aggr fails because of is_suspended in
> > mwifiex_write_data_sync, is the wr_port ever released?
> >
> > Or maybe mp_wr_bitmap is re-read from the card when a download
> > interrupt occurs, or if an upload interrupt occurs after a command is
> > sent.
>
> A patch has been sent to address this issue.

Reviewed, thanks. Functionally identical to a patch that I've been
testing locally:

/*
* Rollback the effects of _get_wr_port above.
*/
static void mwifiex_put_wr_port(struct mwifiex_adapter *adapter, u8 port)
{
struct sdio_mmc_card *card = adapter->card;

card->mp_wr_bitmap |= (1 << port);
card->curr_wr_port = port;
}

...

if (type == MWIFIEX_TYPE_DATA) {
mwifiex_put_wr_port(adapter, port);
adapter->data_sent = false;
}

You may be interested that rd_port also has possible leak when skb
cannot be allocated. See mwifiex_process_int_status:

skb = dev_alloc_skb(rx_len);
if (!skb)
return -1;

We have a hack in OLPC production arm-3.5 kernel which restores
rd_port bit, restores int_status.

http://dev.laptop.org/git/olpc-kernel/commit/?h=arm-3.5&id=59fcaf10cce5bbdc370ec1c262b12aeb66ed1dca

--
James Cameron
http://quozl.linux.org.au/

2014-05-14 04:27:19

by James Cameron

[permalink] [raw]
Subject: Re: mwifiex: no wr_port available

On Tue, May 13, 2014 at 08:54:53PM -0700, Bing Zhao wrote:
> Hi James,
> [...]
> > You may be interested that rd_port also has possible leak when skb
> > cannot be allocated. See mwifiex_process_int_status:
> >
> > skb = dev_alloc_skb(rx_len);
> > if (!skb)
> > return -1;
> >
> > We have a hack in OLPC production arm-3.5 kernel which restores
> > rd_port bit, restores int_status.
> >
> > http://dev.laptop.org/git/olpc-kernel/commit/?h=arm-3.5&id=59fcaf10cce5bbdc370ec1c262b12aeb66ed1dca
>
> The rd_port/int_status change looks correct. But what issue did you
> encounter prior to applying the change "cache the read bitmap at the
> time of the interrupt rather than when the interrupt status bits are
> handled"?

No specific issue, instead ensuring that all effects are rolled back.
The effects of a call to mwifiex_process_int_status that returned -1
due to dev_alloc_skb fail were:

- change to curr_rd_port,

- bits cleared in mp_rd_bitmap,

So both effects were rolled back. Perhaps not needed?

I'm not recommending this as the right solution, because the patch
caused CPU loop during memory exhaustion, the interrupt came
back immediately. I added a hack to reduce impact of that:

http://dev.laptop.org/git/olpc-kernel/commit/?h=arm-3.5&id=491188ebd4721f3c8804b7fa626e63d6fdd2ed09

I did not have the experience to find out _why_ we could not
dev_alloc_skb, but at least was able to let the driver continue rather
than timeout.

--
James Cameron
http://quozl.linux.org.au/

2014-05-14 03:55:13

by Bing Zhao

[permalink] [raw]
Subject: RE: mwifiex: no wr_port available

Hi James,

> > > If mwifiex_host_to_card_mp_aggr fails because of is_suspended in
> > > mwifiex_write_data_sync, is the wr_port ever released?
> > >
> > > Or maybe mp_wr_bitmap is re-read from the card when a download
> > > interrupt occurs, or if an upload interrupt occurs after a command is
> > > sent.
> >
> > A patch has been sent to address this issue.
>
> Reviewed, thanks. Functionally identical to a patch that I've been
> testing locally:

Thanks!

>
> /*
> * Rollback the effects of _get_wr_port above.
> */
> static void mwifiex_put_wr_port(struct mwifiex_adapter *adapter, u8 port)
> {
> struct sdio_mmc_card *card = adapter->card;
>
> card->mp_wr_bitmap |= (1 << port);
> card->curr_wr_port = port;
> }

Yeah, that's the same fix.

>
> ...
>
> if (type == MWIFIEX_TYPE_DATA) {
> mwifiex_put_wr_port(adapter, port);
> adapter->data_sent = false;
> }
>
> You may be interested that rd_port also has possible leak when skb
> cannot be allocated. See mwifiex_process_int_status:
>
> skb = dev_alloc_skb(rx_len);
> if (!skb)
> return -1;
>
> We have a hack in OLPC production arm-3.5 kernel which restores
> rd_port bit, restores int_status.
>
> http://dev.laptop.org/git/olpc-kernel/commit/?h=arm-3.5&id=59fcaf10cce5bbdc370ec1c262b12aeb66ed1dca

The rd_port/int_status change looks correct. But what issue did you encounter prior to applying the change "cache the read bitmap at the time of the interrupt rather than when the interrupt status bits are handled"?

Thanks,
Bing


2014-05-14 04:55:30

by Bing Zhao

[permalink] [raw]
Subject: RE: mwifiex: no wr_port available

Hi James,

> > > You may be interested that rd_port also has possible leak when skb
> > > cannot be allocated. See mwifiex_process_int_status:
> > >
> > > skb = dev_alloc_skb(rx_len);
> > > if (!skb)
> > > return -1;
> > >
> > > We have a hack in OLPC production arm-3.5 kernel which restores
> > > rd_port bit, restores int_status.
> > >
> > > http://dev.laptop.org/git/olpc-kernel/commit/?h=arm-
> 3.5&id=59fcaf10cce5bbdc370ec1c262b12aeb66ed1dca
> >
> > The rd_port/int_status change looks correct. But what issue did you
> > encounter prior to applying the change "cache the read bitmap at the
> > time of the interrupt rather than when the interrupt status bits are
> > handled"?
>
> No specific issue, instead ensuring that all effects are rolled back.
> The effects of a call to mwifiex_process_int_status that returned -1
> due to dev_alloc_skb fail were:
>
> - change to curr_rd_port,
>
> - bits cleared in mp_rd_bitmap,
>
> So both effects were rolled back. Perhaps not needed?

It's probably not needed. Anyway you don't have to remove it as it doesn't seem to hurt.

>
> I'm not recommending this as the right solution, because the patch
> caused CPU loop during memory exhaustion, the interrupt came
> back immediately. I added a hack to reduce impact of that:
>
> http://dev.laptop.org/git/olpc-kernel/commit/?h=arm-3.5&id=491188ebd4721f3c8804b7fa626e63d6fdd2ed09
>
> I did not have the experience to find out _why_ we could not
> dev_alloc_skb, but at least was able to let the driver continue rather
> than timeout.

On some system, once dev_alloc_skb fails it may always fail no matter how many times you try unless you reboot.
So we should allow the failure of dev_alloc_skb for that reason. If your system works this way it's fine. But you can probably consider put a limit on dev_alloc_skb attempts.

Thanks,
Bing