2013-05-26 20:19:42

by Aaron Tomlin

[permalink] [raw]
Subject: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

From: Aaron Tomlin <[email protected]>

Since v1:
- Removed unnecessary parentheses

---8<---

Failed GFP_ATOMIC allocations by the network stack result in dropped
packets, which will be received on a subsequent retransmit, and an
unnecessary, noisy warning with a kernel backtrace.

These warnings are harmless, but they still cause users to panic and
file bug reports over dropped packets. It would be better to hide the
failed allocation warnings and backtraces, and let retransmits handle
dropped packets quietly.

Signed-off-by: Aaron Tomlin <[email protected]>
---
net/core/skbuff.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 02139d6..84aa870 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -236,7 +236,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
? skbuff_fclone_cache : skbuff_head_cache;

if (sk_memalloc_socks() && (flags & SKB_ALLOC_RX))
- gfp_mask |= (__GFP_MEMALLOC|__GFP_NOWARN);
+ gfp_mask |= __GFP_MEMALLOC | __GFP_NOWARN;

/* Get the HEAD */
skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node);
--
1.8.1.4


2013-05-27 17:39:22

by Rik van Riel

[permalink] [raw]
Subject: Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

On 05/26/2013 04:19 PM, [email protected] wrote:
> From: Aaron Tomlin <[email protected]>
>
> Since v1:
> - Removed unnecessary parentheses
>
> ---8<---
>
> Failed GFP_ATOMIC allocations by the network stack result in dropped
> packets, which will be received on a subsequent retransmit, and an
> unnecessary, noisy warning with a kernel backtrace.
>
> These warnings are harmless, but they still cause users to panic and
> file bug reports over dropped packets. It would be better to hide the
> failed allocation warnings and backtraces, and let retransmits handle
> dropped packets quietly.

Yes please. Getting memory management bug reports for
dropped network packets got old years ago. Lets get
rid of those messages.

> Signed-off-by: Aaron Tomlin <[email protected]>

Reviewed-by: Rik van Riel <[email protected]>

2013-05-27 17:56:06

by Sergei Shtylyov

[permalink] [raw]
Subject: Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

Hello.

On 27-05-2013 0:19, [email protected] wrote:

> From: Aaron Tomlin <[email protected]>

> Since v1:
> - Removed unnecessary parentheses

The "changes since version X" section should typically go after ---
tearline, else the mainatiner will have to edit your patch before applying.

> ---8<---

> Failed GFP_ATOMIC allocations by the network stack result in dropped
> packets, which will be received on a subsequent retransmit, and an
> unnecessary, noisy warning with a kernel backtrace.

> These warnings are harmless, but they still cause users to panic and
> file bug reports over dropped packets. It would be better to hide the
> failed allocation warnings and backtraces, and let retransmits handle
> dropped packets quietly.

> Signed-off-by: Aaron Tomlin <[email protected]>

WBR, Sergei

2013-05-27 22:25:24

by Eric Dumazet

[permalink] [raw]
Subject: Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

On Mon, 2013-05-27 at 13:39 -0400, Rik van Riel wrote:
> On 05/26/2013 04:19 PM, [email protected] wrote:
> > From: Aaron Tomlin <[email protected]>
> >
> > Since v1:
> > - Removed unnecessary parentheses
> >
> > ---8<---
> >
> > Failed GFP_ATOMIC allocations by the network stack result in dropped
> > packets, which will be received on a subsequent retransmit, and an
> > unnecessary, noisy warning with a kernel backtrace.

This claim is wrong, only some protocols deal with retransmits.

> >
> > These warnings are harmless, but they still cause users to panic and
> > file bug reports over dropped packets. It would be better to hide the
> > failed allocation warnings and backtraces, and let retransmits handle
> > dropped packets quietly.
>
> Yes please. Getting memory management bug reports for
> dropped network packets got old years ago. Lets get
> rid of those messages.

I am only wondering why this path has anything needing special
attention, over thousands of kmalloc() like call sites in the kernel.

If mm allocation warnings are useless, just make __GFP_NOWARN the
default, and save us thousand of patches (adding the __GFP_NOWARN
everywhere)

Truth is : some network drivers don't deal very well with allocation
errors. mlx4 for example absolutely wants order-2 pages in RX path, with
no fallback to order-0 pages.

So I am not against this patch, but I can not really acknowledge it,
sorry.


2013-05-28 04:31:49

by Joe Perches

[permalink] [raw]
Subject: Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

On Mon, 2013-05-27 at 15:25 -0700, Eric Dumazet wrote:
> On Mon, 2013-05-27 at 13:39 -0400, Rik van Riel wrote:
> > On 05/26/2013 04:19 PM, [email protected] wrote:
> > > Failed GFP_ATOMIC allocations by the network stack result in dropped
> > > packets, which will be received on a subsequent retransmit, and an
> > > unnecessary, noisy warning with a kernel backtrace.
>
> This claim is wrong, only some protocols deal with retransmits.
>
> > > These warnings are harmless, but they still cause users to panic and
> > > file bug reports over dropped packets. It would be better to hide the
> > > failed allocation warnings and backtraces, and let retransmits handle
> > > dropped packets quietly.
> >
> > Yes please. Getting memory management bug reports for
> > dropped network packets got old years ago. Lets get
> > rid of those messages.
>
> I am only wondering why this path has anything needing special
> attention, over thousands of kmalloc() like call sites in the kernel.

I don't think this site is particularly special.

> If mm allocation warnings are useless, just make __GFP_NOWARN the
> default, and save us thousand of patches (adding the __GFP_NOWARN
> everywhere)
>
> Truth is : some network drivers don't deal very well with allocation
> errors. mlx4 for example absolutely wants order-2 pages in RX path, with
> no fallback to order-0 pages.
>
> So I am not against this patch, but I can not really acknowledge it,
> sorry.

I think the __alloc_skb alloc failure message is ok,
but maybe there shouldn't be something "scary" like
a dump_stack.

Maybe this site should use a trivial debug error
message like below instead.
---
net/core/skbuff.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d629891..0154803 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -231,17 +231,24 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
struct sk_buff *skb;
u8 *data;
bool pfmemalloc;
+ bool warn_no_skb = !(gfp_mask & __GFP_NOWARN);

cache = (flags & SKB_ALLOC_FCLONE)
? skbuff_fclone_cache : skbuff_head_cache;

if (sk_memalloc_socks() && (flags & SKB_ALLOC_RX))
- gfp_mask |= __GFP_MEMALLOC;
+ gfp_mask |= __GFP_MEMALLOC | __GFP_NOWARN;

/* Get the HEAD */
skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node);
- if (!skb)
+ if (!skb) {
+ if (warn_no_skb)
+ printk_ratelimited(KERN_DEBUG "%s: OOM from %pF via %pF\n",
+ __func__,
+ __builtin_return_address(0),
+ __builtin_return_address(1));
goto out;
+ }
prefetchw(skb);

/* We do our best to align skb_shared_info on a separate cache

2013-05-28 06:04:19

by Eric Dumazet

[permalink] [raw]
Subject: Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

On Mon, 2013-05-27 at 21:31 -0700, Joe Perches wrote:

> I think the __alloc_skb alloc failure message is ok,
> but maybe there shouldn't be something "scary" like
> a dump_stack.
>
> Maybe this site should use a trivial debug error
> message like below instead.
> ---

Oh well.

If dump_stack are scary, they are scary for every k[mz]alloc() users,
not only __alloc_skb_alloc()

I just said : Please do not add GFP_NOWARN to thousand of call sites,
and you suggest adding more code in network fast path. (???)

This is not a trivial code, we are speaking of a very sensitive one.

Let mm guys explain in what cases a full stack trace is nice to have,
and in what cases its useless. An heuristic should be defined in mm tree
for that, and not spread everywhere.

There must be a reason GFP_NOWARN is seldom used in the kernel, even if
most callers are able to recover properly from a failed memory
allocation.

2013-05-28 06:08:54

by Joe Perches

[permalink] [raw]
Subject: Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

On Mon, 2013-05-27 at 23:04 -0700, Eric Dumazet wrote:
> If dump_stack are scary, they are scary for every k[mz]alloc() users,
> not only __alloc_skb_alloc()

True, but perhaps they are relatively rarer though
for non net use cases.

2013-05-28 17:15:45

by Debabrata Banerjee

[permalink] [raw]
Subject: Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

On Tue, May 28, 2013 at 12:31 AM, Joe Perches <[email protected]> wrote:
>
> I think the __alloc_skb alloc failure message is ok,
> but maybe there shouldn't be something "scary" like
> a dump_stack.
>
> Maybe this site should use a trivial debug error
> message like below instead.

The stack trace may or may not be important for debugging, however
what is important is that we know how often this is happening. That
implies that there should be a percpu counter of allocation failures
here, regardless of a dump_stack, rate limited or not. I would suppose
it's ok for no stack by default as long as there is a counter, for
this specific call only.

There are some scary ways this code is called, for example from
tcp_send_fin(). It does yield() in that loop, but otherwise it just
spins.

-Deb

2013-05-29 07:37:00

by Rik van Riel

[permalink] [raw]
Subject: Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

On 05/27/2013 06:25 PM, Eric Dumazet wrote:
> On Mon, 2013-05-27 at 13:39 -0400, Rik van Riel wrote:


>> Yes please. Getting memory management bug reports for
>> dropped network packets got old years ago. Lets get
>> rid of those messages.
>
> I am only wondering why this path has anything needing special
> attention, over thousands of kmalloc() like call sites in the kernel.

There are a few special things about the network code:

1) network packets can arrive extremely fast, in
large batches
2) the network code cannot wait for the VM to free
memory (GFP_ATOMIC)

Other allocations tend to be done less at a time, and/or
allow the VM to free up memory before proceeding.

> If mm allocation warnings are useless, just make __GFP_NOWARN the
> default, and save us thousand of patches (adding the __GFP_NOWARN
> everywhere)
>
> Truth is : some network drivers don't deal very well with allocation
> errors. mlx4 for example absolutely wants order-2 pages in RX path, with
> no fallback to order-0 pages.

Network protocols and network applications tend to deal
with packet loss by retransmitting data, though.

Also, once all the data from one of those order-2 page
buffers has been delivered or forwarded, that buffer
becomes available to subsequent network packets.

Other allocations tend not to free & reuse their
memory as quickly as the network stack.

> So I am not against this patch, but I can not really acknowledge it,
> sorry.
>
>
>