2016-03-29 04:47:54

by Wei-Ning Huang

Subject: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call

"single skb allocation failure" happens when system is under heavy
memory pressure. Add __GFP_REPEAT to skb allocation call so kernel
attempts to reclaim pages and retry the allocation.

Signed-off-by: Wei-Ning Huang <[email protected]>
---
drivers/net/wireless/marvell/mwifiex/sdio.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index b2c839a..c64989c 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -1124,7 +1124,8 @@ static void mwifiex_deaggr_sdio_pkt(struct mwifiex_adapter *adapter,
 			break;
 		}
 		skb_deaggr = mwifiex_alloc_dma_align_buf(pkt_len,
-							 GFP_KERNEL | GFP_DMA);
+							 GFP_KERNEL | GFP_DMA |
+							 __GFP_REPEAT);
 		if (!skb_deaggr)
 			break;
 		skb_put(skb_deaggr, pkt_len);
@@ -1374,7 +1375,8 @@ static int mwifiex_sdio_card_to_host_mp_aggr(struct mwifiex_adapter *adapter,
 			/* copy pkt to deaggr buf */
 			skb_deaggr = mwifiex_alloc_dma_align_buf(len_arr[pind],
 								 GFP_KERNEL |
-								 GFP_DMA);
+								 GFP_DMA |
+								 __GFP_REPEAT);
 			if (!skb_deaggr) {
 				mwifiex_dbg(adapter, ERROR, "skb allocation failure\t"
 					    "drop pkt len=%d type=%d\n",
@@ -1416,7 +1418,8 @@ rx_curr_single:
 		mwifiex_dbg(adapter, INFO, "info: RX: port: %d, rx_len: %d\n",
 			    port, rx_len);

-		skb = mwifiex_alloc_dma_align_buf(rx_len, GFP_KERNEL | GFP_DMA);
+		skb = mwifiex_alloc_dma_align_buf(rx_len, GFP_KERNEL | GFP_DMA |
+						  __GFP_REPEAT);
 		if (!skb) {
 			mwifiex_dbg(adapter, ERROR,
 				    "single skb allocated fail,\t"
@@ -1521,7 +1524,8 @@ static int mwifiex_process_int_status(struct mwifiex_adapter *adapter)
 		rx_len = (u16) (rx_blocks * MWIFIEX_SDIO_BLOCK_SIZE);
 		mwifiex_dbg(adapter, INFO, "info: rx_len = %d\n", rx_len);

-		skb = mwifiex_alloc_dma_align_buf(rx_len, GFP_KERNEL | GFP_DMA);
+		skb = mwifiex_alloc_dma_align_buf(rx_len, GFP_KERNEL | GFP_DMA |
+						  __GFP_REPEAT);
 		if (!skb)
 			return -1;

--
2.8.0.rc3.226.g39d4020



2016-03-29 12:58:42

by Eric Dumazet

Subject: Re: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call

On Tue, 2016-03-29 at 17:27 +0800, Wei-Ning Huang wrote:
> Adding some chromium devs to the thread.
>
> In, http://lxr.free-electrons.com/source/mm/page_alloc.c#L3152
>
> The default mm retry allocation when 'order <=
> PAGE_ALLOC_COSTLY_ORDER' of gfp_mask contains __GFP_REPEAT.
> PAGE_ALLOC_COSTLY_ORDER is defined to be 3. On systems with page size
> = 4K, this means memory compaction and retry is only done when the
> size of allocation is <= 32K
> In mwifiex, the allocation size is 64K.
>
> When we have system with
> memory fragmentation and allocation failed, there will be no retry.
> This is why we need to add __GFP_REPEAT here to allow the system to
> perform memory compaction and retry allocation.
>
> Maybe Amit@marvell can comment on if this is a good fix on this issue.
> I'm also aware that marvell is the progress of implementing
> scatter/gatter for mwifiex, which can also fix the issue.

Before SG is implemented, you really need to copy incoming frames into
smallest chunks (to get lowest skb->truesize) and leave the 64KB
allocated stuff forever in the driver.

__GFP_REPEAT won't really solve the issue.

It seems the problem comes from the fact that the driver calls
dev_kfree_skb_any() after calling mwifiex_deaggr_sdio_pkt(), instead of
recycling this very precious 64KB skb once memory gets fragmented.

Another problem is that mwifiex_deaggr_sdio_pkt() uses
mwifiex_alloc_dma_align_buf() with GFP_KERNEL | GFP_DMA

Really GFP_DMA makes no sense here, since the skb is going to be
processed by the stack, which has no such requirement.

Please use normal skb allocations there.

diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index b2c839a..8404db5 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -1123,8 +1123,8 @@ static void mwifiex_deaggr_sdio_pkt(struct mwifiex_adapter *adapter,
 				    __func__, pkt_len, blk_size);
 			break;
 		}
-		skb_deaggr = mwifiex_alloc_dma_align_buf(pkt_len,
-							 GFP_KERNEL | GFP_DMA);
+		skb_deaggr = __netdev_alloc_skb_ip_align(NULL, pkt_len,
+							 GFP_KERNEL);
 		if (!skb_deaggr)
 			break;
 		skb_put(skb_deaggr, pkt_len);





2016-03-29 08:38:03

by Kalle Valo

Subject: Re: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call

Wei-Ning Huang <[email protected]> writes:

> "single skb allocation failure" happens when system is under heavy
> memory pressure. Add __GFP_REPEAT to skb allocation call so kernel
> attempts to reclaim pages and retry the allocation.
>
> Signed-off-by: Wei-Ning Huang <[email protected]>

Is this really a proper way to fix the issue? This is the first time I'm
hearing about the flag and there isn't even a single user in
drivers/net. I would like to get confirmation from others that
__GFP_REPEAT is really ok to use in a wireless driver before I can take
this.

--
Kalle Valo

2016-03-29 09:27:17

by Wei-Ning Huang

Subject: Re: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call

Adding some chromium devs to the thread.

See http://lxr.free-electrons.com/source/mm/page_alloc.c#L3152

By default, mm retries the allocation when 'order <=
PAGE_ALLOC_COSTLY_ORDER' or when gfp_mask contains __GFP_REPEAT.
PAGE_ALLOC_COSTLY_ORDER is defined to be 3. On systems with page size
= 4K, this means memory compaction and retry are only done when the
size of the allocation is <= 32K.
In mwifiex, the allocation size is 64K. When memory is fragmented
and the allocation fails, there will be no retry.
This is why we need to add __GFP_REPEAT here to allow the system to
perform memory compaction and retry the allocation.

Maybe Amit@marvell can comment on whether this is a good fix for this issue.
I'm also aware that Marvell is in the process of implementing
scatter/gather for mwifiex, which can also fix the issue.

Wei-Ning

On Tue, Mar 29, 2016 at 4:37 PM, Kalle Valo <[email protected]> wrote:
> Wei-Ning Huang <[email protected]> writes:
>
>> "single skb allocation failure" happens when system is under heavy
>> memory pressure. Add __GFP_REPEAT to skb allocation call so kernel
>> attempts to reclaim pages and retry the allocation.
>>
>> Signed-off-by: Wei-Ning Huang <[email protected]>
>
> Is this really a proper way to fix the issue? This is the first time I'm
> hearing about the flag and there isn't even a single user in
> drivers/net. I would like to get confirmation from others that
> __GFP_REPEAT is really ok to use in a wireless driver before I can take
> this.
>
> --
> Kalle Valo



--
Wei-Ning Huang, 黃偉寧 | Software Engineer, Google Inc., Taiwan |
[email protected] | Cell: +886 910-380678

2016-03-29 10:51:59

by Amitkumar Karwar

Subject: RE: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call

> From: Wei-Ning Huang [mailto:[email protected]]
> Sent: Tuesday, March 29, 2016 2:57 PM
> To: Kalle Valo
> Cc: Linux Wireless; LKML; Amitkumar Karwar; Nishant Sarmukadam; Sameer
> Nanda; [email protected]; Sonny Rao; Douglas Anderson
> Subject: Re: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call
>
> Adding some chromium devs to the thread.
>
> In, http://lxr.free-electrons.com/source/mm/page_alloc.c#L3152
>
> The default mm retry allocation when 'order <= PAGE_ALLOC_COSTLY_ORDER'
> of gfp_mask contains __GFP_REPEAT.
> PAGE_ALLOC_COSTLY_ORDER is defined to be 3. On systems with page size =
> 4K, this means memory compaction and retry is only done when the size of
> allocation is <= 32K In mwifiex, the allocation size is 64K. When we
> have system with memory fragmentation and allocation failed, there will
> be no retry.
> This is why we need to add __GFP_REPEAT here to allow the system to
> perform memory compaction and retry allocation.
>
> Maybe Amit@marvell can comment on if this is a good fix on this issue.
> I'm also aware that marvell is the progress of implementing
> scatter/gatter for mwifiex, which can also fix the issue.
>
> Wei-Ning
>

This fix would be useful. We have a feature called single port aggregation in which sometimes data received from SDIO interface can be >32k (but less than 64k). This feature improves throughput performance. We are preparing patches for scatter/gather feature. but scatter/gather won't be supported by some platforms. Hence this fix would still be needed.

Regards,
Amitkumar

2016-03-29 05:38:09

by James Cameron

Subject: Re: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call

On Tue, Mar 29, 2016 at 12:47:20PM +0800, Wei-Ning Huang wrote:
> "single skb allocation failure" happens when system is under heavy
> memory pressure. Add __GFP_REPEAT to skb allocation call so kernel
> attempts to reclaim pages and retry the allocation.

Oh, that's interesting, we're back to this symptom again.

Nice to see this fix.

Heavy memory pressure on 3.5 caused dev_alloc_skb failure in this
driver. Tracked at OLPC as #12694.

--
James Cameron
http://quozl.netrek.org/

2016-04-05 05:48:36

by Amitkumar Karwar

Subject: RE: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call

Hi Eric,

Thanks for the comments.

> From: Eric Dumazet [mailto:[email protected]]
> Sent: Tuesday, March 29, 2016 6:29 PM
> To: Wei-Ning Huang
> Cc: Kalle Valo; Linux Wireless; LKML; Amitkumar Karwar; Nishant
> Sarmukadam; Sameer Nanda; [email protected]; Sonny Rao; Douglas
> Anderson
> Subject: Re: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call
>
> On Tue, 2016-03-29 at 17:27 +0800, Wei-Ning Huang wrote:
> > Adding some chromium devs to the thread.
> >
> > In, http://lxr.free-electrons.com/source/mm/page_alloc.c#L3152
> >
> > The default mm retry allocation when 'order <=
> > PAGE_ALLOC_COSTLY_ORDER' of gfp_mask contains __GFP_REPEAT.
> > PAGE_ALLOC_COSTLY_ORDER is defined to be 3. On systems with page size
> > = 4K, this means memory compaction and retry is only done when the
> > size of allocation is <= 32K In mwifiex, the allocation size is 64K.
>
>
>
> >  When we have system with
> > memory fragmentation and allocation failed, there will be no retry.
> > This is why we need to add __GFP_REPEAT here to allow the system to
> > perform memory compaction and retry allocation.
> >
> > Maybe Amit@marvell can comment on if this is a good fix on this issue.
> > I'm also aware that marvell is the progress of implementing
> > scatter/gatter for mwifiex, which can also fix the issue.
>
> Before SG is implemented, you really need to copy incoming frames into
> smallest chunks (to get lowest skb->truesize) and leave the 64KB
> allocated stuff forever in the driver.

We do have a 64KB pre-allocated buffer for receiving Rx data in our driver.

>
> __GFP_REPEAT wont really solve the issue.
>
> It seems the problem comes from the fact that the drivers calls
> dev_kfree_skb_any() after calling mwifiex_deaggr_sdio_pkt(), instead of
> recycling this very precious 64KB skb once memory gets fragmented.

Our one time allocated 64k buffer read from firmware contains multiple data chunks. We have a feature called single port aggregation in which firmware attaches an aggregated buffer to single port. So sometimes a single data chunk can exceed 32k. dev_kfree_skb_any() is called to free those data chunks.

>
> Another problem is that mwifiex_deaggr_sdio_pkt() uses
> mwifiex_alloc_dma_align_buf() with GFP_KERNEL | GFP_DMA
>
> Really GFP_DMA makes no sense here, since the skb is going to be
> processed by the stack, which has no such requirement.
>
> Please use normal skb allocations there.

Sure. I will submit a patch for this.

Regards,
Amitkumar

2016-04-05 18:24:49

by David Laight

Subject: RE: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call

From: Amitkumar Karwar
> Sent: 05 April 2016 06:48
...
> Our one time allocated 64k buffer read from firmware contains multiple data chunks. We have a feature
> called single port aggregation in which firmware attaches an aggregated buffer to single port. So
> sometimes a single data chunk can exceed 32k. dev_kfree_skb_any() is called to free those data chunks.

Ah yes, which particular problem does aggregating data into a single buffer solve?

	David