2015-11-22 08:44:48

by Shunqian Zheng

Subject: [RFC PATCH v1] Trying to fix the stmmac memory leak during suspend/resume

From: ZhengShunQian <[email protected]>

When I run a suspend-to-RAM stress test on my Rockchip RK3288 (SoC) board,
which integrates the stmmac Ethernet controller, it always runs out of
memory (OOM) after a few iterations; 50 cycles are usually enough to
reproduce it.

With the kernel compiled with the KMEMLEAK feature, I got the log below:
unreferenced object 0xed89ac00 (size 192):
comm "busybox", pid 79, jiffies 2251 (age 54.580s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 d1 ed 00 00 00 00 00 00 00 00 ................
backtrace:
[<c05430dc>] kmemleak_alloc+0x44/0x78
[<c012b198>] kmem_cache_alloc+0x1ac/0x264
[<c0442134>] __build_skb+0x38/0x9c
[<c0442350>] __netdev_alloc_skb+0xac/0x118
[<c0335f8c>] init_dma_desc_rings+0xcc/0x474
[<c0336f20>] stmmac_resume+0xc4/0x14c
[<c033ed28>] stmmac_pltfr_resume+0x3c/0x40
[<c02ed4dc>] platform_pm_resume+0x3c/0x50
[<c02f7ad0>] dpm_run_callback+0x7c/0x160
[<c02f7e14>] device_resume+0x174/0x1c0
[<c02f9368>] dpm_resume+0x110/0x2cc
[<c02f9830>] dpm_resume_end+0x1c/0x28
[<c00646c0>] suspend_devices_and_enter+0x53c/0x6ec
[<c0064ba4>] pm_suspend+0x334/0x478
[<c0063110>] state_store+0xac/0xc8
[<c0265228>] kobj_attr_store+0x1c/0x28
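
The backtrace shows init_dma_desc_rings() being reached again from
stmmac_resume(). A minimal userspace model of that pattern is sketched
below; it is not driver code. The ring size, the struct, and every
model_* name are hypothetical, and buffers are represented by integer
handles rather than real skbs. It only illustrates why each resume would
orphan one full ring of buffers if the init path overwrites slots that
are already populated.

```c
#define RING_SIZE 8            /* hypothetical; stands in for priv->dma_rx_size */

struct ring_model {
	int slot[RING_SIZE];   /* 0 = empty, otherwise a buffer handle */
	int next_handle;       /* last handle handed out */
	int live;              /* buffers allocated and not yet freed */
};

/* Models an init path that overwrites every slot unconditionally. */
static void model_init_ring(struct ring_model *r)
{
	for (int i = 0; i < RING_SIZE; i++) {
		r->slot[i] = ++r->next_handle;   /* previous handle is lost */
		r->live++;
	}
}

/* Buffers still allocated that no ring slot points at any more. */
static int model_leaked(const struct ring_model *r)
{
	int reachable = 0;

	for (int i = 0; i < RING_SIZE; i++)
		if (r->slot[i])
			reachable++;
	return r->live - reachable;
}

/* One initial ring setup followed by `cycles` suspend/resume iterations. */
static int model_leak_after(int cycles)
{
	struct ring_model r = { 0 };

	model_init_ring(&r);                 /* initial setup */
	for (int c = 0; c < cycles; c++)
		model_init_ring(&r);         /* resume re-runs the init */
	return model_leaked(&r);
}
```

In this model, model_leak_after(n) grows linearly with n, which matches
the observation that more suspend/resume cycles eventually exhaust memory.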

To be honest, I don't think I know net/stmmac well enough to fix this bug
properly. I would really appreciate it if the net/stmmac experts could
take it over, if you agree that it is a bug.

ZhengShunQian (1):
net: stmmac: Free rx_skbufs before realloc

drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

--
1.9.1


2015-11-22 08:44:58

by Shunqian Zheng

Subject: [PATCH v1] net: stmmac: Free rx_skbufs before realloc

From: ZhengShunQian <[email protected]>

init_dma_desc_rings() may reallocate rx_skbuff[] during suspend
and resume. This patch frees rx_skbuff[] before reallocating
the memory.

Signed-off-by: ZhengShunQian <[email protected]>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 64d8aa4..2af1ed9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1022,6 +1022,14 @@ static void stmmac_free_rx_buffers(struct stmmac_priv *priv, int i)
 	priv->rx_skbuff[i] = NULL;
 }

+static void dma_free_rx_skbufs(struct stmmac_priv *priv)
+{
+	int i;
+
+	for (i = 0; i < priv->dma_rx_size; i++)
+		stmmac_free_rx_buffers(priv, i);
+}
+
 /**
  * init_dma_desc_rings - init the RX/TX descriptor rings
  * @dev: net device structure
@@ -1058,6 +1066,8 @@ static int init_dma_desc_rings(struct net_device *dev, gfp_t flags)
 		/* RX INITIALIZATION */
 		pr_debug("\tSKB addresses:\nskb\t\tskb data\tdma data\n");
 	}
+
+	dma_free_rx_skbufs(priv);
 	for (i = 0; i < rxsize; i++) {
 		struct dma_desc *p;
 		if (priv->extend_desc)
@@ -1122,14 +1132,6 @@ err_init_rx_buffers:
 	return ret;
 }

-static void dma_free_rx_skbufs(struct stmmac_priv *priv)
-{
-	int i;
-
-	for (i = 0; i < priv->dma_rx_size; i++)
-		stmmac_free_rx_buffers(priv, i);
-}
-
 static void dma_free_tx_skbufs(struct stmmac_priv *priv)
 {
 	int i;
--
1.9.1

2015-11-24 18:09:29

by David Miller

Subject: Re: [PATCH v1] net: stmmac: Free rx_skbufs before realloc

From: Shunqian Zheng <[email protected]>
Date: Sun, 22 Nov 2015 16:44:18 +0800

> From: ZhengShunQian <[email protected]>
>
> The init_dma_desc_rings() may realloc the rx_skbuff[] when
> suspend and resume. This patch free the rx_skbuff[] before
> reallocing memory.
>
> Signed-off-by: ZhengShunQian <[email protected]>

This isn't really the right way to fix this.

I see two reasonable approaches:

1) suspend liberates the RX ring, although this approach is less
desirable

2) resume doesn't try to allocate already populated RX ring
entries

Freeing the whole RX ring just to allocate it again immediately
makes no sense at all and is wasteful work.
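
Approach 2 can be sketched in userspace as follows. This is only an
illustration under stated assumptions, not the driver's code: the struct,
RING_SIZE, and the model_* functions are hypothetical, and malloc()
stands in for the skb allocation. Any re-initialisation of the
descriptors themselves that the hardware may need on resume is not
modelled here.

```c
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 8   /* hypothetical; stands in for the RX ring size */

struct rx_ring_model {
	void *buf[RING_SIZE];   /* stands in for priv->rx_skbuff[] */
};

/*
 * Fill only the empty slots and leave populated ones untouched.
 * Returns the number of new allocations, or -1 if one of them fails.
 */
static int model_refill_ring(struct rx_ring_model *r, size_t buf_len)
{
	int allocated = 0;

	for (int i = 0; i < RING_SIZE; i++) {
		if (r->buf[i])
			continue;       /* already populated: keep it */
		r->buf[i] = malloc(buf_len);
		if (!r->buf[i])
			return -1;
		allocated++;
	}
	return allocated;
}

/* Release every buffer, as a close/teardown path would. */
static void model_free_ring(struct rx_ring_model *r)
{
	for (int i = 0; i < RING_SIZE; i++) {
		free(r->buf[i]);
		r->buf[i] = NULL;
	}
}
```

Calling model_refill_ring() a second time on a full ring allocates
nothing and keeps the existing pointers, so a resume built this way
would neither leak nor redo work.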

2015-11-25 15:13:48

by Peppe CAVALLARO

Subject: Re: [PATCH v1] net: stmmac: Free rx_skbufs before realloc

Hello

On 11/24/2015 7:09 PM, David Miller wrote:
> From: Shunqian Zheng <[email protected]>
> Date: Sun, 22 Nov 2015 16:44:18 +0800
>
>> From: ZhengShunQian <[email protected]>
>>
>> The init_dma_desc_rings() may realloc the rx_skbuff[] when
>> suspend and resume. This patch free the rx_skbuff[] before
>> reallocing memory.
>>
>> Signed-off-by: ZhengShunQian <[email protected]>
>
> This isn't really the right way to fix this.
>
> I see two reasonable approaches:
>
> 1) suspend liberates the RX ring, although this approach is less
> desirable
>
> 2) resume doesn't try to allocate already populated RX ring
> entries
>
> Freeing the whole RX ring just to allocate it again immediately
> makes no sense at all and is wasteful work.

This is a bug in this driver version that, to be honest, we fixed with
the first approach in the STi kernel.
That patch just calls dma_free_rx_skbufs(priv) in the suspend path.
I can give you this patch, which is tested on my side too.
But I do think we should move to the second approach.
Indeed, on ST platforms too, when we play with suspend states the DDR
is in self-refresh, so the data are not lost at all.
There is no reason to free and reallocate everything in suspend/resume.
I can test that and then provide another patch to this mailing list
asap.

Let me know.
peppe
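
For comparison, the first approach (free the RX buffers in suspend,
allocate them again in resume) can be modelled as below. This is a
userspace sketch only: the struct, RING_SIZE, and the model_* names are
hypothetical, and malloc()/free() stand in for the skb allocation and
for what dma_free_rx_skbufs() would release. It shows that the number of
live buffers stays constant across cycles, at the cost of one full
free/allocate pass per cycle.

```c
#include <stdlib.h>

#define RING_SIZE 8   /* hypothetical; stands in for the RX ring size */

struct pm_ring_model {
	void *buf[RING_SIZE];   /* stands in for priv->rx_skbuff[] */
	int live;               /* buffers currently allocated */
};

/* Suspend side of approach 1: release every RX buffer. */
static void model_suspend(struct pm_ring_model *r)
{
	for (int i = 0; i < RING_SIZE; i++) {
		if (r->buf[i]) {
			free(r->buf[i]);
			r->buf[i] = NULL;
			r->live--;
		}
	}
}

/* Resume side: allocate the whole ring again; -1 on allocation failure. */
static int model_resume(struct pm_ring_model *r)
{
	for (int i = 0; i < RING_SIZE; i++) {
		r->buf[i] = malloc(64);
		if (!r->buf[i]) {
			model_suspend(r);   /* undo the partial allocation */
			return -1;
		}
		r->live++;
	}
	return 0;
}

/* Live buffer count after one setup and `cycles` suspend/resume pairs. */
static int model_live_after(int cycles)
{
	struct pm_ring_model r = { 0 };
	int live;

	if (model_resume(&r) < 0)           /* initial setup */
		return -1;
	for (int c = 0; c < cycles; c++) {
		model_suspend(&r);
		if (model_resume(&r) < 0)
			return -1;
	}
	live = r.live;
	model_suspend(&r);                  /* tidy up before returning */
	return live;
}
```

The count never exceeds one ring's worth of buffers in this model, which
is why this approach stops the leak even though it repeats work that the
second approach avoids.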

2015-11-26 10:26:42

by Peppe CAVALLARO

Subject: Re: [PATCH v1] net: stmmac: Free rx_skbufs before realloc

On 11/25/2015 4:13 PM, Giuseppe CAVALLARO wrote:
> Hello
>
> On 11/24/2015 7:09 PM, David Miller wrote:
>> From: Shunqian Zheng <[email protected]>
>> Date: Sun, 22 Nov 2015 16:44:18 +0800
>>
>>> From: ZhengShunQian <[email protected]>
>>>
>>> The init_dma_desc_rings() may realloc the rx_skbuff[] when
>>> suspend and resume. This patch free the rx_skbuff[] before
>>> reallocing memory.
>>>
>>> Signed-off-by: ZhengShunQian <[email protected]>
>>
>> This isn't really the right way to fix this.
>>
>> I see two reasonable approaches:
>>
>> 1) suspend liberates the RX ring, although this approach is less
>> desirable
>>
>> 2) resume doesn't try to allocate already populated RX ring
>> entries
>>
>> Freeing the whole RX ring just to allocate it again immediately
>> makes no sense at all and is wasteful work.
>
> This is a bug in this driver version that, to be honest, we fixed with
> the first approach on STi kernel.
> The patch just called the dma_free_rx_skbufs(priv) in the suspend.
> I can give you this patch that is tested on my side too.
> But! I do think we should move on second approach.
> Indeed, also on ST platforms, when we play with suspend states
> the DDR although in self-refresh the data are not lost at all.
> No reason to free and reallocate all in suspend/resume.
> I can test that and then provide another patch to this mailing list
> asap.

I have just sent the patch (directly for approach #2).

Peppe
