From: Aaron Tomlin <[email protected]>
Since v1:
- Removed unnecessary parentheses (sergei.shtylyov)
---8<---
Failed GFP_ATOMIC allocations by the network stack result in dropped
packets, which will be received on a subsequent retransmit, and an
unnecessary, noisy warning with a kernel backtrace.
These warnings are harmless, but they still cause users to panic and
file bug reports over dropped packets. It would be better to hide the
failed allocation warnings and backtraces, and let retransmits handle
dropped packets quietly.
Signed-off-by: Aaron Tomlin <[email protected]>
---
net/core/skbuff.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index af9185d..84aa870 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -236,7 +236,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
 		? skbuff_fclone_cache : skbuff_head_cache;
 
 	if (sk_memalloc_socks() && (flags & SKB_ALLOC_RX))
-		gfp_mask |= __GFP_MEMALLOC;
+		gfp_mask |= __GFP_MEMALLOC | __GFP_NOWARN;
 
 	/* Get the HEAD */
 	skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node);
--
1.8.1.4
On Sun, May 26, 2013 at 09:45:01PM +0100, [email protected] wrote:
> From: Aaron Tomlin <[email protected]>
>
> Since v1:
> - Removed unnecessary parentheses (sergei.shtylyov)
>
> ---8<---
>
> Failed GFP_ATOMIC allocations by the network stack result in dropped
> packets, which will be received on a subsequent retransmit, and an
> unnecessary, noisy warning with a kernel backtrace.
>
> These warnings are harmless, but they still cause users to panic and
> file bug reports over dropped packets. It would be better to hide the
> failed allocation warnings and backtraces, and let retransmits handle
> dropped packets quietly.
>
> Signed-off-by: Aaron Tomlin <[email protected]>
> ---
Acked-by: Rafael Aquini <[email protected]>
> net/core/skbuff.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index af9185d..84aa870 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -236,7 +236,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
> ? skbuff_fclone_cache : skbuff_head_cache;
>
> if (sk_memalloc_socks() && (flags & SKB_ALLOC_RX))
> - gfp_mask |= __GFP_MEMALLOC;
> + gfp_mask |= __GFP_MEMALLOC | __GFP_NOWARN;
>
> /* Get the HEAD */
> skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node);
> --
> 1.8.1.4
>
[email protected] <[email protected]> :
[...]
> Failed GFP_ATOMIC allocations by the network stack result in dropped
> packets, which will be received on a subsequent retransmit, and an
> unnecessary, noisy warning with a kernel backtrace.
>
> These warnings are harmless, but they still cause users to panic and
> file bug reports over dropped packets. It would be better to hide the
> failed allocation warnings and backtraces, and let retransmits handle
> dropped packets quietly.
Linux VM may be perfect but device drivers do stupid things.
Please don't paper over it just because some shit ends in your backyard.
--
Ueimor
On 05/27/2013 03:41 PM, Francois Romieu wrote:
> [email protected] <[email protected]> :
> [...]
>> Failed GFP_ATOMIC allocations by the network stack result in dropped
>> packets, which will be received on a subsequent retransmit, and an
>> unnecessary, noisy warning with a kernel backtrace.
>>
>> These warnings are harmless, but they still cause users to panic and
>> file bug reports over dropped packets. It would be better to hide the
>> failed allocation warnings and backtraces, and let retransmits handle
>> dropped packets quietly.
>
> Linux VM may be perfect but device drivers do stupid things.
>
> Please don't paper over it just because some shit ends in your backyard.
We should rate-limit these messages at least. When a system is low on memory
the logs can quickly fill up with useless OOM messages, further slowing
the system...
Ben
>
--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com
On Tue, May 28, 2013 at 09:00:45AM -0700, Ben Greear wrote:
> On 05/27/2013 03:41 PM, Francois Romieu wrote:
> >[email protected] <[email protected]> :
> >[...]
> >>Failed GFP_ATOMIC allocations by the network stack result in dropped
> >>packets, which will be received on a subsequent retransmit, and an
> >>unnecessary, noisy warning with a kernel backtrace.
> >>
> >>These warnings are harmless, but they still cause users to panic and
> >>file bug reports over dropped packets. It would be better to hide the
> >>failed allocation warnings and backtraces, and let retransmits handle
> >>dropped packets quietly.
> >
> >Linux VM may be perfect but device drivers do stupid things.
> >
> >Please don't paper over it just because some shit ends in your backyard.
>
> We should rate-limit these messages at least. When a system is low on memory
> the logs can quickly fill up with useless OOM messages, further slowing
> the system...
>
The real problem seems to be that more and more the network stack (drivers, perhaps)
is relying on chunks of contiguous page-blocks without a fallback mechanism to
order-0 page allocations. When memory gets fragmented, these alloc failures
start to pop up more often and they scare ordinary sysadmins out of their pants.
The big point of this change was to attempt to relieve some of these warnings
which we believed to be useless, since the net stack would recover from it
by re-transmissions.
We might have misjudged the scenario, though. Perhaps a better approach would be
making the warning less verbose for all page-alloc failures. We could, perhaps,
only print a stack-dump out, if some debug flag is passed along, either as
reference, or by some CONFIG_DEBUG_ preprocessor directive.
Rafael
On 05/28/2013 09:15 AM, Rafael Aquini wrote:
> On Tue, May 28, 2013 at 09:00:45AM -0700, Ben Greear wrote:
>> On 05/27/2013 03:41 PM, Francois Romieu wrote:
>>> [email protected] <[email protected]> :
>>> [...]
>>>> Failed GFP_ATOMIC allocations by the network stack result in dropped
>>>> packets, which will be received on a subsequent retransmit, and an
>>>> unnecessary, noisy warning with a kernel backtrace.
>>>>
>>>> These warnings are harmless, but they still cause users to panic and
>>>> file bug reports over dropped packets. It would be better to hide the
>>>> failed allocation warnings and backtraces, and let retransmits handle
>>>> dropped packets quietly.
>>>
>>> Linux VM may be perfect but device drivers do stupid things.
>>>
>>> Please don't paper over it just because some shit ends in your backyard.
>>
>> We should rate-limit these messages at least. When a system is low on memory
>> the logs can quickly fill up with useless OOM messages, further slowing
>> the system...
>>
>
> The real problem seems to be that more and more the network stack (drivers, perhaps)
> is relying on chunks of contiguous page-blocks without a fallback mechanism to
> order-0 page allocations. When memory gets fragmented, these alloc failures
> start to pop up more often and they scare ordinary sysadmins out of their pants.
>
> The big point of this change was to attempt to relieve some of these warnings
> which we believed to be useless, since the net stack would recover from it
> by re-transmissions.
> We might have misjudged the scenario, though. Perhaps a better approach would be
> making the warning less verbose for all page-alloc failures. We could, perhaps,
> only print a stack-dump out, if some debug flag is passed along, either as
> reference, or by some CONFIG_DEBUG_ preprocessor directive.
I have seen the logs spammed with order-0 allocation errors. Maybe the system had
legitimate issues, but continuous spamming made it even harder to figure out
the problem, and constantly trying to write that much text to the serial console
has a big performance impact, further slowing the system when it should instead
be clearing its packet backlog or whatever.
Maybe print the first OOM message with lots of details, and then use
some rate-limiting stuff to print out summary details at most every 5 seconds
or so after that. Could reset the verbose timer after some period of no
OOM messages.
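A rough user-space sketch of that policy, just to make the idea concrete.
Nothing here is a kernel API: the struct, the function names and the
interval values are all made up for illustration, and time is a plain
counter in seconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical, not kernel interfaces. */
enum report_kind { REPORT_NONE, REPORT_VERBOSE, REPORT_SUMMARY };

struct oom_reporter {
	long summary_interval;    /* minimum seconds between summaries */
	long quiet_reset;         /* quiet seconds that re-arm the verbose report */
	long last_event;          /* time of the most recent failure */
	long last_report;         /* time of the most recent printed report */
	unsigned long suppressed; /* failures swallowed since the last report */
	bool armed;               /* true when the next failure is reported verbosely */
};

static void oom_reporter_init(struct oom_reporter *r, long summary_interval,
			      long quiet_reset)
{
	assert(r != NULL);
	r->summary_interval = summary_interval;
	r->quiet_reset = quiet_reset;
	r->last_event = 0;
	r->last_report = 0;
	r->suppressed = 0;
	r->armed = true;
}

/* Decide what to print for a failure seen at time `now` (in seconds). */
static enum report_kind oom_reporter_event(struct oom_reporter *r, long now)
{
	assert(r != NULL);

	/* A long enough quiet period re-arms the verbose report. */
	if (!r->armed && now - r->last_event >= r->quiet_reset) {
		r->armed = true;
		r->suppressed = 0;
	}
	r->last_event = now;

	if (r->armed) {
		r->armed = false;
		r->last_report = now;
		return REPORT_VERBOSE;
	}
	if (now - r->last_report >= r->summary_interval) {
		r->last_report = now;
		r->suppressed = 0;
		return REPORT_SUMMARY;
	}
	r->suppressed++;
	return REPORT_NONE;
}
```

With oom_reporter_init(&r, 5, 60), the first failure is reported verbosely,
later failures collapse into at most one summary per 5 seconds, and a
failure arriving after 60 quiet seconds is reported verbosely again.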
Ben
--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com
On Tue, 2013-05-28 at 13:15 -0300, Rafael Aquini wrote:
> The real problem seems to be that more and more the network stack (drivers, perhaps)
> is relying on chunks of contiguous page-blocks without a fallback mechanism to
> order-0 page allocations. When memory gets fragmented, these alloc failures
> start to pop up more often and they scare ordinary sysadmins out of their pants.
>
Where do you see that ?
I see exactly the opposite trend.
We have less and less buggy drivers, and we want to catch last
offenders.
> The big point of this change was to attempt to relieve some of these warnings
> which we believed to be useless, since the net stack would recover from it
> by re-transmissions.
> We might have misjudged the scenario, though. Perhaps a better approach would be
> making the warning less verbose for all page-alloc failures. We could, perhaps,
> only print a stack-dump out, if some debug flag is passed along, either as
> reference, or by some CONFIG_DEBUG_ preprocessor directive.
warn_alloc_failed() uses the standard DEFAULT_RATELIMIT_INTERVAL which
is very small (5 * HZ)
I would bump nopage_rs to something more reasonable, like one hour or one
day.
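To see what a longer interval buys, here is a small user-space model of an
interval/burst rate limiter. It is only a sketch: the struct and function
names are invented for this example, time is a counter in seconds rather
than jiffies, and the burst of 10 is an assumed value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model, not the kernel's ratelimit code. */
struct demo_ratelimit {
	long interval; /* window length, in seconds */
	int burst;     /* messages allowed per window */
	long begin;    /* start of the current window */
	int printed;   /* messages allowed so far in this window */
	bool started;  /* false until the first event opens a window */
};

static void demo_ratelimit_init(struct demo_ratelimit *rs, long interval,
				int burst)
{
	assert(rs != NULL);
	rs->interval = interval;
	rs->burst = burst;
	rs->begin = 0;
	rs->printed = 0;
	rs->started = false;
}

/* Return true if a message at time `now` may be printed. */
static bool demo_ratelimit_allow(struct demo_ratelimit *rs, long now)
{
	assert(rs != NULL);

	/* Open a fresh window on the first event or once the old one expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	return false;
}

/* Count how many of `n` failures, one per second from t=0, get logged. */
static int demo_count_logged(long interval, int burst, int n)
{
	struct demo_ratelimit rs;
	int logged = 0;

	demo_ratelimit_init(&rs, interval, burst);
	for (int t = 0; t < n; t++)
		if (demo_ratelimit_allow(&rs, t))
			logged++;
	return logged;
}
```

In this model, one failure per second for an hour is never throttled with a
5-second interval and a burst of 10 (demo_count_logged(5, 10, 3600) returns
3600), while a one-hour interval lets only the first 10 through
(demo_count_logged(3600, 10, 3600) returns 10).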
On Tue, 2013-05-28 at 09:00 -0700, Ben Greear wrote:
> On 05/27/2013 03:41 PM, Francois Romieu wrote:
> > [email protected] <[email protected]> :
> > [...]
> >> Failed GFP_ATOMIC allocations by the network stack result in dropped
> >> packets, which will be received on a subsequent retransmit, and an
> >> unnecessary, noisy warning with a kernel backtrace.
[]
> > Please don't paper over it just because some shit ends in your backyard.
> We should rate-limit these messages at least.
Already done.
Look in mm/page_alloc.c:warn_alloc_failed()

void warn_alloc_failed(gfp_t gfp_mask, int order, const char *fmt, ...)
{
	unsigned int filter = SHOW_MEM_FILTER_NODES;

	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs) ||
	    debug_guardpage_minorder() > 0)
		return;
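
For readers less familiar with the flag mechanics, a stand-alone sketch of
how that early return interacts with the proposed patch. The DEMO_ names
and bit values are hypothetical (real __GFP_* values differ between kernel
versions), and the rate-limit result is passed in as a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, chosen only for this example. */
typedef unsigned int demo_gfp_t;
#define DEMO_GFP_MEMALLOC 0x1u
#define DEMO_GFP_NOWARN   0x2u

/* Models the early return: a caller-set flag or the rate limit silences it. */
static bool demo_should_warn(demo_gfp_t mask, bool ratelimit_ok)
{
	if ((mask & DEMO_GFP_NOWARN) || !ratelimit_ok)
		return false;
	return true;
}

/* Models what the patch does to the mask in __alloc_skb(). */
static demo_gfp_t demo_rx_mask(demo_gfp_t mask, bool memalloc_socks, bool rx)
{
	/* The two flags must be distinct bits for the OR below to make sense. */
	assert((DEMO_GFP_MEMALLOC & DEMO_GFP_NOWARN) == 0);

	if (memalloc_socks && rx)
		mask |= DEMO_GFP_MEMALLOC | DEMO_GFP_NOWARN;
	return mask;
}
```

Note that in this model, as in the posted diff, the no-warn bit is only
added when both conditions of the if statement hold; every other caller
still reaches the rate-limit check.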
On Tue, May 28, 2013 at 09:29:37AM -0700, Eric Dumazet wrote:
> On Tue, 2013-05-28 at 13:15 -0300, Rafael Aquini wrote:
>
> > The real problem seems to be that more and more the network stack (drivers, perhaps)
> > is relying on chunks of contiguous page-blocks without a fallback mechanism to
> > order-0 page allocations. When memory gets fragmented, these alloc failures
> > start to pop up more often and they scare ordinary sysadmins out of their pants.
> >
>
> Where do you see that ?
>
> I see exactly the opposite trend.
>
> We have less and less buggy drivers, and we want to catch last
> offenders.
>
Perhaps the explanation is that we're looking at the bad effects of old stuff,
then. But just to list a few for your appreciation:
--------------------------------------------------------
Apr 23 11:25:31 217-IDC kernel: httpd: page allocation failure. order:1,
mode:0x20
Apr 23 11:25:31 217-IDC kernel: Pid: 19747, comm: httpd Not tainted
2.6.32-358.2.1.el6.x86_64 #1
Apr 23 11:25:31 217-IDC kernel: Call Trace:
Apr 23 11:25:31 217-IDC kernel: <IRQ> [<ffffffff8112c207>] ?
__alloc_pages_nodemask+0x757/0x8d0
Apr 23 11:25:31 217-IDC kernel: [<ffffffffa0337361>] ?
bond_start_xmit+0x2f1/0x5d0 [bonding]
....
--------------------------------------------------------
Apr 4 18:51:32 exton kernel: swapper: page allocation failure. order:1,
mode:0x20
Apr 4 18:51:32 exton kernel: Pid: 0, comm: swapper Not tainted
2.6.32-279.19.1.el6.x86_64 #1
Apr 4 18:51:32 exton kernel: Call Trace:
Apr 4 18:51:32 exton kernel: <IRQ> [<ffffffff811231ff>] ?
__alloc_pages_nodemask+0x77f/0x940
Apr 4 18:51:32 exton kernel: [<ffffffff8115d1a2>] ? kmem_getpages+0x62/0x170
Apr 4 18:51:32 exton kernel: [<ffffffff8115ddba>] ? fallback_alloc+0x1ba/0x270
Apr 4 18:51:32 exton kernel: [<ffffffff8115d80f>] ? cache_grow+0x2cf/0x320
Apr 4 18:51:32 exton kernel: [<ffffffff8115db39>] ?
____cache_alloc_node+0x99/0x160
Apr 4 18:51:32 exton kernel: [<ffffffff8115ed00>] ?
kmem_cache_alloc_node_trace+0x90/0x200
Apr 4 18:51:32 exton kernel: [<ffffffff8115ef1d>] ? __kmalloc_node+0x4d/0x60
Apr 4 18:51:32 exton kernel: [<ffffffff8141ea1d>] ? __alloc_skb+0x6d/0x190
Apr 4 18:51:32 exton kernel: [<ffffffff8141eb5d>] ? dev_alloc_skb+0x1d/0x40
Apr 4 18:51:32 exton kernel: [<ffffffffa04f5f50>] ?
ipoib_cm_alloc_rx_skb+0x30/0x430 [ib_ipoib]
Apr 4 18:51:32 exton kernel: [<ffffffffa04f71ef>] ?
ipoib_cm_handle_rx_wc+0x29f/0x770 [ib_ipoib]
Apr 4 18:51:32 exton kernel: [<ffffffffa03c6a46>] ? mlx4_ib_poll_cq+0x2c6/0x7f0
[mlx4_ib]
....
--------------------------------------------------------
May 14 09:00:34 ifil03 kernel: swapper: page allocation failure. order:1,
mode:0x20
May 14 09:00:34 ifil03 kernel: Pid: 0, comm: swapper Not tainted
2.6.32-220.el6.x86_64 #1
May 14 09:00:34 ifil03 kernel: Call Trace:
May 14 09:00:34 ifil03 kernel: <IRQ> [<ffffffff81123f0f>] ?
__alloc_pages_nodemask+0x77f/0x940
May 14 09:00:34 ifil03 kernel: [<ffffffff8115ddc2>] ? kmem_getpages+0x62/0x170
May 14 09:00:34 ifil03 kernel: [<ffffffff8115e9da>] ? fallback_alloc+0x1ba/0x270
May 14 09:00:34 ifil03 kernel: [<ffffffff8115e42f>] ? cache_grow+0x2cf/0x320
May 14 09:00:34 ifil03 kernel: [<ffffffff8115e759>] ?
____cache_alloc_node+0x99/0x160
May 14 09:00:34 ifil03 kernel: [<ffffffff8115f53b>] ?
kmem_cache_alloc+0x11b/0x190
May 14 09:00:34 ifil03 kernel: [<ffffffff8141f528>] ? sk_prot_alloc+0x48/0x1c0
May 14 09:00:34 ifil03 kernel: [<ffffffff8141f7b2>] ? sk_clone+0x22/0x2e0
May 14 09:00:34 ifil03 kernel: [<ffffffff8146ca26>] ? inet_csk_clone+0x16/0xd0
May 14 09:00:34 ifil03 kernel: [<ffffffff814858f3>] ?
tcp_create_openreq_child+0x23/0x450
May 14 09:00:34 ifil03 kernel: [<ffffffff814832dd>] ?
tcp_v4_syn_recv_sock+0x4d/0x2a0
May 14 09:00:34 ifil03 kernel: [<ffffffff814856b1>] ? tcp_check_req+0x201/0x420
May 14 09:00:34 ifil03 kernel: [<ffffffff8147b166>] ?
tcp_rcv_state_process+0x116/0xa30
May 14 09:00:34 ifil03 kernel: [<ffffffff81482cfb>] ? tcp_v4_do_rcv+0x35b/0x430
May 14 09:00:34 ifil03 kernel: [<ffffffff81484471>] ? tcp_v4_rcv+0x4e1/0x860
May 14 09:00:34 ifil03 kernel: [<ffffffff814621fd>] ?
ip_local_deliver_finish+0xdd/0x2d0
May 14 09:00:34 ifil03 kernel: [<ffffffff81462488>] ? ip_local_deliver+0x98/0xa0
May 14 09:00:34 ifil03 kernel: [<ffffffff8146194d>] ? ip_rcv_finish+0x12d/0x440
May 14 09:00:34 ifil03 kernel: [<ffffffff8101bd86>] ?
intel_pmu_enable_all+0xa6/0x150
May 14 09:00:34 ifil03 kernel: [<ffffffff81461ed5>] ? ip_rcv+0x275/0x350
May 14 09:00:34 ifil03 kernel: [<ffffffff8142bedb>] ?
__netif_receive_skb+0x49b/0x6e0
May 14 09:00:34 ifil03 kernel: [<ffffffff8142df88>] ?
netif_receive_skb+0x58/0x60
May 14 09:00:34 ifil03 kernel: [<ffffffffa00a0a9e>] ?
vmxnet3_rq_rx_complete+0x36e/0x880 [vmxnet3]
....
--------------------------------------------------------
> > The big point of this change was to attempt to relieve some of these warnings
> > which we believed to be useless, since the net stack would recover from it
> > by re-transmissions.
> > We might have misjudged the scenario, though. Perhaps a better approach would be
> > making the warning less verbose for all page-alloc failures. We could, perhaps,
> > only print a stack-dump out, if some debug flag is passed along, either as
> > reference, or by some CONFIG_DEBUG_ preprocessor directive.
>
>
> warn_alloc_failed() uses the standard DEFAULT_RATELIMIT_INTERVAL which
> is very small (5 * HZ)
>
> I would bump nopage_rs to something more reasonable, like one hour or one
> day.
>
Neat! Worth a try, no doubt about that. Aaron?
Cheers!
-- Rafael
On Tue, 2013-05-28 at 14:43 -0300, Rafael Aquini wrote:
>
> Perhaps the explanation is because we're looking into old stuff bad effects,
> then. But just to list a few for your appreciation:
> --------------------------------------------------------
> --------------------------------------------------------
> May 14 09:00:34 ifil03 kernel: swapper: page allocation failure. order:1,
> mode:0x20
> May 14 09:00:34 ifil03 kernel: Pid: 0, comm: swapper Not tainted
> 2.6.32-220.el6.x86_64 #1
> May 14 09:00:34 ifil03 kernel: Call Trace:
> May 14 09:00:34 ifil03 kernel: <IRQ> [<ffffffff81123f0f>] ?
> __alloc_pages_nodemask+0x77f/0x940
> May 14 09:00:34 ifil03 kernel: [<ffffffff8115ddc2>] ? kmem_getpages+0x62/0x170
> May 14 09:00:34 ifil03 kernel: [<ffffffff8115e9da>] ? fallback_alloc+0x1ba/0x270
> May 14 09:00:34 ifil03 kernel: [<ffffffff8115e42f>] ? cache_grow+0x2cf/0x320
> May 14 09:00:34 ifil03 kernel: [<ffffffff8115e759>] ?
> ____cache_alloc_node+0x99/0x160
> May 14 09:00:34 ifil03 kernel: [<ffffffff8115f53b>] ?
> kmem_cache_alloc+0x11b/0x190
> May 14 09:00:34 ifil03 kernel: [<ffffffff8141f528>] ? sk_prot_alloc+0x48/0x1c0
> May 14 09:00:34 ifil03 kernel: [<ffffffff8141f7b2>] ? sk_clone+0x22/0x2e0
> May 14 09:00:34 ifil03 kernel: [<ffffffff8146ca26>] ? inet_csk_clone+0x16/0xd0
> May 14 09:00:34 ifil03 kernel: [<ffffffff814858f3>] ?
> tcp_create_openreq_child+0x23/0x450
> May 14 09:00:34 ifil03 kernel: [<ffffffff814832dd>] ?
> tcp_v4_syn_recv_sock+0x4d/0x2a0
> May 14 09:00:34 ifil03 kernel: [<ffffffff814856b1>] ? tcp_check_req+0x201/0x420
> May 14 09:00:34 ifil03 kernel: [<ffffffff8147b166>] ?
> tcp_rcv_state_process+0x116/0xa30
> May 14 09:00:34 ifil03 kernel: [<ffffffff81482cfb>] ? tcp_v4_do_rcv+0x35b/0x430
> May 14 09:00:34 ifil03 kernel: [<ffffffff81484471>] ? tcp_v4_rcv+0x4e1/0x860
> May 14 09:00:34 ifil03 kernel: [<ffffffff814621fd>] ?
> ip_local_deliver_finish+0xdd/0x2d0
> May 14 09:00:34 ifil03 kernel: [<ffffffff81462488>] ? ip_local_deliver+0x98/0xa0
> May 14 09:00:34 ifil03 kernel: [<ffffffff8146194d>] ? ip_rcv_finish+0x12d/0x440
> May 14 09:00:34 ifil03 kernel: [<ffffffff8101bd86>] ?
> intel_pmu_enable_all+0xa6/0x150
> May 14 09:00:34 ifil03 kernel: [<ffffffff81461ed5>] ? ip_rcv+0x275/0x350
> May 14 09:00:34 ifil03 kernel: [<ffffffff8142bedb>] ?
> __netif_receive_skb+0x49b/0x6e0
> May 14 09:00:34 ifil03 kernel: [<ffffffff8142df88>] ?
> netif_receive_skb+0x58/0x60
> May 14 09:00:34 ifil03 kernel: [<ffffffffa00a0a9e>] ?
> vmxnet3_rq_rx_complete+0x36e/0x880 [vmxnet3]
> ....
> --
I hope you do realize this path has nothing to do with skb allocation,
but with socket cloning?
On Tue, 2013-05-28 at 14:43 -0300, Rafael Aquini wrote:
> Perhaps the explanation is because we're looking into old stuff bad effects,
> then. But just to list a few for your appreciation:
> --------------------------------------------------------
> Apr 23 11:25:31 217-IDC kernel: httpd: page allocation failure. order:1,
> mode:0x20
> Apr 23 11:25:31 217-IDC kernel: Pid: 19747, comm: httpd Not tainted
> 2.6.32-358.2.1.el6.x86_64 #1
> Apr 23 11:25:31 217-IDC kernel: Call Trace:
> Apr 23 11:25:31 217-IDC kernel: <IRQ> [<ffffffff8112c207>] ?
> __alloc_pages_nodemask+0x757/0x8d0
> Apr 23 11:25:31 217-IDC kernel: [<ffffffffa0337361>] ?
> bond_start_xmit+0x2f1/0x5d0 [bonding]
> ....
> --------------------------------------------------------
> Apr 4 18:51:32 exton kernel: swapper: page allocation failure. order:1,
> mode:0x20
> Apr 4 18:51:32 exton kernel: Pid: 0, comm: swapper Not tainted
> 2.6.32-279.19.1.el6.x86_64 #1
> Apr 4 18:51:32 exton kernel: Call Trace:
> Apr 4 18:51:32 exton kernel: <IRQ> [<ffffffff811231ff>] ?
> __alloc_pages_nodemask+0x77f/0x940
> Apr 4 18:51:32 exton kernel: [<ffffffff8115d1a2>] ? kmem_getpages+0x62/0x170
> Apr 4 18:51:32 exton kernel: [<ffffffff8115ddba>] ? fallback_alloc+0x1ba/0x270
> Apr 4 18:51:32 exton kernel: [<ffffffff8115d80f>] ? cache_grow+0x2cf/0x320
> Apr 4 18:51:32 exton kernel: [<ffffffff8115db39>] ?
> ____cache_alloc_node+0x99/0x160
> Apr 4 18:51:32 exton kernel: [<ffffffff8115ed00>] ?
> kmem_cache_alloc_node_trace+0x90/0x200
> Apr 4 18:51:32 exton kernel: [<ffffffff8115ef1d>] ? __kmalloc_node+0x4d/0x60
> Apr 4 18:51:32 exton kernel: [<ffffffff8141ea1d>] ? __alloc_skb+0x6d/0x190
> Apr 4 18:51:32 exton kernel: [<ffffffff8141eb5d>] ? dev_alloc_skb+0x1d/0x40
> Apr 4 18:51:32 exton kernel: [<ffffffffa04f5f50>] ?
> ipoib_cm_alloc_rx_skb+0x30/0x430 [ib_ipoib]
> Apr 4 18:51:32 exton kernel: [<ffffffffa04f71ef>] ?
> ipoib_cm_handle_rx_wc+0x29f/0x770 [ib_ipoib]
> Apr 4 18:51:32 exton kernel: [<ffffffffa03c6a46>] ? mlx4_ib_poll_cq+0x2c6/0x7f0
> [mlx4_ib]
> ....
> ----
This one seems a real bug/problem in
drivers/infiniband/ulp/ipoib/ipoib_cm.c
It uses :

	IPOIB_CM_HEAD_SIZE = IPOIB_CM_BUF_SIZE % PAGE_SIZE,
	IPOIB_CM_RX_SG = ALIGN(IPOIB_CM_BUF_SIZE, PAGE_SIZE) / PAGE_SIZE,

but then, ipoib_cm_alloc_rx_skb() does :

	skb = dev_alloc_skb(IPOIB_CM_HEAD_SIZE + 12);

so really asking more than one page for the first frag (skb->head),
while the intent of the code was to use order-0 allocations.

	for (i = 0; i < frags; i++) {
		struct page *page = alloc_page(GFP_ATOMIC);
		....

Ideally, IPOIB_CM_HEAD_SIZE should be redefined to use

	SKB_MAX_HEAD(NET_SKB_PAD + 12)

so that skb->head would use exactly one order-0 page, not an order-1 one.
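
The arithmetic behind that can be sketched in user space. Everything here is
an assumption made for illustration: a 4096-byte page, a head request of
4084 + 12 bytes, and 384 bytes of padding plus shared-info overhead added
by the allocator. The real numbers depend on the kernel version and
configuration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u /* assumed page size */

/* Smallest order such that (DEMO_PAGE_SIZE << order) >= size. */
static unsigned int demo_order_for(size_t size)
{
	unsigned int order = 0;
	size_t span = DEMO_PAGE_SIZE;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Total bytes behind skb->head: requested head plus allocator overhead. */
static size_t demo_head_bytes(size_t head, size_t overhead)
{
	return head + overhead;
}
```

Under those assumptions, demo_order_for(demo_head_bytes(4084 + 12, 384)) is
1, because the overhead pushes the request past one page, while a head
sized as a page minus the overhead,
demo_order_for(demo_head_bytes(4096 - 384, 384)), stays at order 0.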
Do you now understand why we should not hide allocation errors?
On Tue, 2013-05-28 at 09:29 -0700, Eric Dumazet wrote:
> I would bump nopage_rs to something more reasonable, like one hour or one
> day.
Reasonable is harder to specify but perhaps it could
be made runtime configurable.
On 05/27/2013 06:41 PM, Francois Romieu wrote:
> [email protected] <[email protected]> :
> [...]
>> Failed GFP_ATOMIC allocations by the network stack result in dropped
>> packets, which will be received on a subsequent retransmit, and an
>> unnecessary, noisy warning with a kernel backtrace.
>>
>> These warnings are harmless, but they still cause users to panic and
>> file bug reports over dropped packets. It would be better to hide the
>> failed allocation warnings and backtraces, and let retransmits handle
>> dropped packets quietly.
>
> Linux VM may be perfect but device drivers do stupid things.
>
> Please don't paper over it just because some shit ends in your backyard.
It is impossible to free memory at the speed at which
10Gbit network packets can come in.
Dropped packets are a reality.
The network stack already has statistics counters to
keep track of dropped packets. There is absolutely
no reason to print out an entire kernel backtrace
for dropped network packets.
All that achieves is to get people to file bug reports,
which nothing can be done about. Oh, and distract
them from whatever issue was causing their actual
problem, and delay them fixing what was going on.
On 05/28/2013 12:29 PM, Eric Dumazet wrote:
> On Tue, 2013-05-28 at 13:15 -0300, Rafael Aquini wrote:
>
>> The real problem seems to be that more and more the network stack (drivers, perhaps)
>> is relying on chunks of contiguous page-blocks without a fallback mechanism to
>> order-0 page allocations. When memory gets fragmented, these alloc failures
>> start to pop up more often and they scare ordinary sysadmins out of their pants.
>>
>
> Where do you see that ?
>
> I see exactly the opposite trend.
>
> We have less and less buggy drivers, and we want to catch last
> offenders.
These backtraces would still get printed out for drivers
that DO do the right thing and fall back to smaller
allocations.
The initial failed large allocation would cause a backtrace.
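
A toy model of that situation, in user space. The allocator, the flag and
the warning counter are all hypothetical stand-ins; the point is only to
show that a well-behaved fallback path still triggers one warning per
attempt unless the opportunistic first try is marked as quiet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_NOWARN    0x1u  /* hypothetical "do not warn" flag */
#define DEMO_PAGE_SIZE 4096u /* assumed page size */

static unsigned int demo_warnings;      /* warnings "printed" so far */
static unsigned int demo_max_order = 0; /* largest order the model can satisfy */

/* Model allocator: fails above demo_max_order, warns unless told not to. */
static void *demo_alloc_pages(unsigned int order, unsigned int flags)
{
	if (order > demo_max_order) {
		if (!(flags & DEMO_NOWARN))
			demo_warnings++;
		return NULL;
	}
	return malloc((size_t)DEMO_PAGE_SIZE << order);
}

/* Try a large buffer first, then fall back to a single page. */
static void *demo_alloc_rx_buffer(unsigned int want_order,
				  unsigned int first_flags,
				  unsigned int *got_order)
{
	void *buf;

	assert(got_order != NULL);

	buf = demo_alloc_pages(want_order, first_flags);
	if (buf) {
		*got_order = want_order;
		return buf;
	}
	/* Fallback: an order-0 request with default warning behaviour. */
	*got_order = 0;
	return demo_alloc_pages(0, 0);
}
```

Calling demo_alloc_rx_buffer(2, 0, &got) succeeds through the fallback but
bumps demo_warnings by one; passing DEMO_NOWARN for the first attempt gives
the same buffer without the warning.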