2007-11-05 17:42:26

by Frank van Maarseveen

Subject: VM/networking crash cause #1: page allocation failure (order:1, GFP_ATOMIC)

For quite some time I've been seeing occasional lockups spread over 50
different machines I'm maintaining. Symptom: a page allocation failure
with order:1, GFP_ATOMIC, while there seems to be plenty of memory (lots
of free pages, almost no swap used), followed by a lockup (everything
dead). I've collected all (12) crash cases which occurred in the last 10
weeks on 50 machines total (i.e. one crash per machine every 41 weeks on
average). The kernel messages are summarized to show the interesting part
(IMO) they have in common. Over the years this has become crash cause #1
for stable kernels for me (fglrx doesn't count ;).

One note: I suspect that reporting a GFP_ATOMIC allocation failure in a
network driver via that same driver (netconsole) may not be the smartest
thing to do, and this could be responsible for the lockup itself. However,
the initial page allocation failure remains and I'm not sure how to
address that problem.

I still think the issue is memory fragmentation but if so, it looks
a bit extreme to me: one system with 2GB of RAM crashed after a day,
merely running a couple of TCP server programs. All systems have either
1 or 2GB of RAM and at least 1GB of (mostly unused) swap.


2.6.22.10: lokka kswapd0: page allocation failure. order:1, mode:0x4020
Nov 5 12:58:27 lokka __alloc_skb+0x4d/0x110
Nov 5 12:58:27 lokka __netdev_alloc_skb+0x23/0x50
Nov 5 12:58:27 lokka e1000_alloc_rx_buffers+0x127/0x320
Nov 5 12:58:27 lokka e1000_clean_rx_irq+0x299/0x510
Nov 5 12:58:27 lokka e1000_intr+0x80/0x140
Nov 5 12:58:27 lokka e1000_intr+0x80/0x140
Nov 5 12:58:27 lokka handle_IRQ_event+0x28/0x60
Nov 5 12:58:27 lokka handle_fasteoi_irq+0x6e/0xd0
Nov 5 12:58:27 lokka do_IRQ+0x4a/0xa0
...
Nov 5 12:58:27 lokka Active:139432 inactive:152030 dirty:4406 writeback:4 unstable:0
Nov 5 12:58:27 lokka free:89531 slab:122946 mapped:4421 pagetables:295 bounce:0
...
Nov 5 12:58:27 lokka Swap cache: add 16, delete 16, find 0/0, race 0+0
Nov 5 12:58:27 lokka Free swap = 1004004kB
Nov 5 12:58:27 lokka Total swap = 1004052kB
Nov 5 12:58:27 lokka Free swap: 1004004kB
Nov 5 12:58:27 lokka 523912 pages of RAM
Nov 5 12:58:27 lokka 294536 pages of HIGHMEM


2.6.21.6: somero ftxpd: page allocation failure. order:1, mode:0x20
Oct 29 11:48:07 somero __alloc_skb+0x4c/0x110
Oct 29 11:48:07 somero __netdev_alloc_skb+0x1d/0x50
Oct 29 11:48:07 somero tg3_alloc_rx_skb+0x74/0x150
Oct 29 11:48:07 somero tg3_rx+0x35a/0x440
Oct 29 11:48:07 somero tg3_poll+0x77/0x1e0
Oct 29 11:48:07 somero net_rx_action+0x99/0x190
Oct 29 11:48:07 somero __do_softirq+0x80/0xf0
Oct 29 11:48:07 somero do_softirq+0x3c/0x40
Oct 29 11:48:07 somero irq_exit+0x45/0x50
Oct 29 11:48:07 somero do_IRQ+0x50/0xa0
...
Oct 29 11:48:07 somero Active:227052 inactive:48283 dirty:3423 writeback:0 unstable:0
Oct 29 11:48:07 somero free:113487 slab:109249 mapped:177942 pagetables:645 bounce:0
...
Oct 29 11:48:07 somero Swap cache: add 428601, delete 422625, find 84672794/84685901, race 0+7
Oct 29 11:48:07 somero Free swap = 1901644kB
Oct 29 11:48:07 somero Total swap = 2008116kB
Oct 29 11:48:07 somero Free swap: 1901644kB
Oct 29 11:48:07 somero 521862 pages of RAM
Oct 29 11:48:07 somero 292486 pages of HIGHMEM


2.6.22.10: naantali kswapd0: page allocation failure. order:1, mode:0x4020
(uptime: one day, merely running a couple of TCP server programs)
Oct 26 11:27:01 naantali __alloc_skb+0x4d/0x110
Oct 26 11:27:01 naantali tcp_collapse+0x17e/0x3b0
Oct 26 11:27:01 naantali tcp_prune_queue+0x7f/0x1c0
Oct 26 11:27:01 naantali tcp_data_queue+0x487/0x720
Oct 26 11:27:01 naantali tcp_rcv_established+0x3a0/0x6e0
Oct 26 11:27:01 naantali tcp_v4_do_rcv+0xe9/0x100
Oct 26 11:27:01 naantali tcp_v4_rcv+0x7f1/0x8d0
Oct 26 11:27:01 naantali ip_local_deliver+0xef/0x250
Oct 26 11:27:01 naantali ip_rcv+0x264/0x560
Oct 26 11:27:01 naantali netif_receive_skb+0x2ad/0x320
Oct 26 11:27:01 naantali process_backlog+0x91/0x120
Oct 26 11:27:01 naantali net_rx_action+0x8d/0x170
Oct 26 11:27:01 naantali __do_softirq+0x78/0x100
Oct 26 11:27:01 naantali do_softirq+0x3c/0x40
Oct 26 11:27:01 naantali irq_exit+0x45/0x50
Oct 26 11:27:01 naantali do_IRQ+0x4f/0xa0
...
Oct 26 11:27:01 naantali Active:115770 inactive:257188 dirty:10243 writeback:1920 unstable:0
Oct 26 11:27:01 naantali free:41073 slab:97335 mapped:2540 pagetables:149 bounce:0
...
Oct 26 11:27:01 naantali Swap cache: add 90, delete 88, find 18/28, race 0+0
Oct 26 11:27:01 naantali Free swap = 2008048kB
Oct 26 11:27:01 naantali Total swap = 2008084kB
Oct 26 11:27:01 naantali Free swap: 2008048kB
Oct 26 11:27:01 naantali 524144 pages of RAM
Oct 26 11:27:01 naantali 294768 pages of HIGHMEM


2.6.21.6: koli ftxpd: page allocation failure. order:1, mode:0x20
Oct 12 14:56:44 koli __alloc_skb+0x4c/0x110
Oct 12 14:56:44 koli __netdev_alloc_skb+0x1d/0x50
Oct 12 14:56:44 koli tg3_alloc_rx_skb+0x74/0x150
Oct 12 14:56:44 koli tg3_rx+0x35a/0x440
Oct 12 14:56:44 koli tg3_poll+0x77/0x1e0
Oct 12 14:56:44 koli net_rx_action+0x99/0x190
Oct 12 14:56:44 koli __do_softirq+0x80/0xf0
Oct 12 14:56:44 koli do_softirq+0x3c/0x40
Oct 12 14:56:44 koli irq_exit+0x45/0x50
Oct 12 14:56:44 koli do_IRQ+0x50/0xa0


2.6.21.6: salla swirl: page allocation failure. order:1, mode:0x20
Oct 1 08:51:58 salla __alloc_skb+0x4c/0x110
Oct 1 08:51:58 salla __netdev_alloc_skb+0x1d/0x50
Oct 1 08:51:58 salla e1000_alloc_rx_buffers+0x140/0x340
Oct 1 08:51:58 salla e1000_clean_rx_irq+0x2c9/0x520
Oct 1 08:51:58 salla e1000_intr+0x7f/0x150
Oct 1 08:51:58 salla handle_IRQ_event+0x26/0x60
Oct 1 08:51:58 salla handle_fasteoi_irq+0x6b/0xd0
Oct 1 08:51:58 salla do_IRQ+0x4b/0xa0


2.6.22.6: lokka ftxpd: page allocation failure. order:1, mode:0x4020
Sep 17 10:34:49 lokka __alloc_skb+0x4d/0x110
Sep 17 10:34:49 lokka tcp_collapse+0x17e/0x3b0
Sep 17 10:34:49 lokka tcp_prune_queue+0x7f/0x1c0
Sep 17 10:34:49 lokka tcp_data_queue+0x487/0x720
Sep 17 10:34:49 lokka tcp_rcv_established+0x3a0/0x6e0
Sep 17 10:34:49 lokka tcp_v4_do_rcv+0xe9/0x100
Sep 17 10:34:49 lokka tcp_v4_rcv+0x7f1/0x8d0
Sep 17 10:34:49 lokka ip_local_deliver+0xef/0x250
Sep 17 10:34:49 lokka ip_rcv+0x264/0x560
Sep 17 10:34:49 lokka netif_receive_skb+0x2ad/0x320
Sep 17 10:34:49 lokka process_backlog+0x91/0x120
Sep 17 10:34:49 lokka net_rx_action+0x8d/0x170
Sep 17 10:34:49 lokka __do_softirq+0x78/0x100
Sep 17 10:34:49 lokka do_softirq+0x3c/0x40
Sep 17 10:34:49 lokka irq_exit+0x45/0x50
Sep 17 10:34:49 lokka do_IRQ+0x4f/0xa0


2.6.21.6: karvio swapper: page allocation failure. order:1, mode:0x20
Sep 17 10:48:48 karvio __alloc_skb+0x4c/0x110
Sep 17 10:48:48 karvio __netdev_alloc_skb+0x1d/0x50
Sep 17 10:48:48 karvio e1000_alloc_rx_buffers+0x140/0x340
Sep 17 10:48:48 karvio e1000_clean_rx_irq+0x2c9/0x520
Sep 17 10:48:48 karvio e1000_intr+0x7f/0x150
Sep 17 10:48:48 karvio handle_IRQ_event+0x26/0x60
Sep 17 10:48:48 karvio handle_fasteoi_irq+0x6b/0xd0
Sep 17 10:48:48 karvio do_IRQ+0x4b/0xa0


2.6.21.6: nivala ftxpd: page allocation failure. order:1, mode:0x20
Sep 20 10:32:49 nivala __alloc_skb+0x4c/0x110
Sep 20 10:32:49 nivala __netdev_alloc_skb+0x1d/0x50
Sep 20 10:32:49 nivala e1000_alloc_rx_buffers+0x140/0x340
Sep 20 10:32:49 nivala e1000_clean_rx_irq+0x2f6/0x520
Sep 20 10:32:49 nivala e1000_intr+0x7f/0x150
Sep 20 10:32:49 nivala handle_IRQ_event+0x26/0x60
Sep 20 10:32:49 nivala handle_fasteoi_irq+0x6b/0xd0
Sep 20 10:32:49 nivala do_IRQ+0x4b/0xa0


2.6.21.7: lahti ftxpd: page allocation failure. order:1, mode:0x20
Sep 3 09:46:11 lahti __alloc_skb+0x4c/0x110
Sep 3 09:46:11 lahti __netdev_alloc_skb+0x1d/0x50
Sep 3 09:46:11 lahti tg3_alloc_rx_skb+0x74/0x150
Sep 3 09:46:11 lahti tg3_rx+0x35a/0x440
Sep 3 09:46:11 lahti tg3_poll+0x77/0x1e0
Sep 3 09:46:11 lahti net_rx_action+0x99/0x190
Sep 3 09:46:11 lahti __do_softirq+0x80/0xf0
Sep 3 09:46:11 lahti do_softirq+0x3c/0x40
Sep 3 09:46:11 lahti irq_exit+0x45/0x50
Sep 3 09:46:11 lahti do_IRQ+0x50/0xa0


2.6.21.6: ropi ftxpd: page allocation failure. order:1, mode:0x20
Aug 30 10:40:46 ropi __alloc_skb+0x4c/0x110
Aug 30 10:40:46 ropi tcp_collapse+0x152/0x390
Aug 30 10:40:46 ropi tcp_prune_queue+0x84/0x1d0
Aug 30 10:40:46 ropi tcp_data_queue+0x4a6/0x760
Aug 30 10:40:46 ropi tcp_rcv_established+0x3ed/0x740
Aug 30 10:40:46 ropi tcp_v4_do_rcv+0xd0/0xe0
Aug 30 10:40:46 ropi tcp_v4_rcv+0x7d2/0x890
Aug 30 10:40:46 ropi ip_local_deliver+0xf2/0x260
Aug 30 10:40:46 ropi ip_rcv+0x259/0x510
Aug 30 10:40:46 ropi netif_receive_skb+0x280/0x2f0
Aug 30 10:40:46 ropi process_backlog+0x8d/0x140
Aug 30 10:40:46 ropi net_rx_action+0x99/0x190
Aug 30 10:40:46 ropi __do_softirq+0x80/0xf0
Aug 30 10:40:46 ropi do_softirq+0x3c/0x40
Aug 30 10:40:46 ropi irq_exit+0x45/0x50
Aug 30 10:40:46 ropi do_IRQ+0x50/0xa0


2.6.21.6: ivalo swapper: page allocation failure. order:1, mode:0x20
Aug 30 10:46:58 ivalo __alloc_skb+0x4c/0x110
Aug 30 10:46:58 ivalo __netdev_alloc_skb+0x1d/0x50
Aug 30 10:46:58 ivalo e1000_alloc_rx_buffers+0x140/0x340
Aug 30 10:46:58 ivalo e1000_clean_rx_irq+0x2c9/0x520
Aug 30 10:46:58 ivalo e1000_intr+0x7f/0x150
Aug 30 10:46:58 ivalo handle_IRQ_event+0x26/0x60
Aug 30 10:46:58 ivalo handle_fasteoi_irq+0x6b/0xd0
Aug 30 10:46:58 ivalo do_IRQ+0x4b/0xa0


2.6.21.7: lokka ftxpd: page allocation failure. order:1, mode:0x20
Aug 31 16:30:02 lokka __alloc_skb+0x4c/0x110
Aug 31 16:30:02 lokka __netdev_alloc_skb+0x1d/0x50
Aug 31 16:30:02 lokka e1000_alloc_rx_buffers+0x140/0x340
Aug 31 16:30:02 lokka e1000_clean_rx_irq+0x2c9/0x520
Aug 31 16:30:02 lokka e1000_intr+0x7f/0x150
Aug 31 16:30:02 lokka handle_IRQ_event+0x26/0x60
Aug 31 16:30:02 lokka handle_fasteoi_irq+0x6b/0xd0
Aug 31 16:30:02 lokka do_IRQ+0x4b/0xa0

--
Frank


2007-11-06 23:14:52

by Robert Hancock

Subject: Re: VM/networking crash cause #1: page allocation failure (order:1, GFP_ATOMIC)

Frank van Maarseveen wrote:
> For quite some time I'm seeing occasional lockups spread over 50 different
> machines I'm maintaining. Symptom: a page allocation failure with order:1,
> GFP_ATOMIC, while there is plenty of memory, as it seems (lots of free
> pages, almost no swap used) followed by a lockup (everything dead). I've
> collected all (12) crash cases which occurred the last 10 weeks on 50
> machines total (i.e. 1 crash every 41 weeks on average). The kernel
> messages are summarized to show the interesting part (IMO) they have
> in common. Over the years this has become the crash cause #1 for stable
> kernels for me (fglrx doesn't count ;).
>
> One note: I suspect that reporting a GFP_ATOMIC allocation failure in a
> network driver via that same driver (netconsole) may not be the smartest
> thing to do and this could be responsible for the lockup itself. However,
> the initial page allocation failure remains and I'm not sure how to
> address that problem.
>
> I still think the issue is memory fragmentation but if so, it looks
> a bit extreme to me: One system with 2GB of ram crashed after a day,
> merely running a couple of TCP server programs. All systems have either
> 1 or 2GB ram and at least 1G of (merely unused) swap.

These are all order-1 allocations for received network packets that need
to be allocated out of low memory (assuming you're using a 32-bit
kernel), so it's quite possible for them to fail on occasion. (Are you
using jumbo frames?)

That should not be causing a lockup though; the received packet should
just get dropped.
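
Roughly, the refill path where these failures show up follows the pattern
below. This is a simplified sketch, not the actual e1000/tg3 code; "struct
rx_ring", post_buffer_to_hardware() and the field names are placeholders
for illustration. A failed atomic skb allocation is only counted and the
refill retried later, which by itself should cost no more than a dropped
packet:

/*
 * Simplified sketch of a NIC driver's RX ring refill; not the actual
 * e1000/tg3 source, just the general pattern those drivers follow.
 */
struct rx_ring {
    unsigned int to_fill;        /* descriptors waiting for a buffer */
    unsigned int buf_len;        /* size of each receive buffer */
    unsigned long alloc_failed;  /* counter, like e1000's alloc_rx_buff_failed */
};

static void rx_refill_sketch(struct net_device *netdev, struct rx_ring *ring)
{
    while (ring->to_fill) {
        /* GFP_ATOMIC under the hood: may fail, must never sleep */
        struct sk_buff *skb = netdev_alloc_skb(netdev, ring->buf_len);

        if (!skb) {
            /* No order-1 lowmem right now: give up on this pass.
             * The corresponding packet is dropped and the refill
             * is retried on a later interrupt. */
            ring->alloc_failed++;
            break;
        }
        post_buffer_to_hardware(ring, skb);  /* placeholder helper */
        ring->to_fill--;
    }
}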

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2007-11-07 00:08:19

by Nick Piggin

Subject: Re: VM/networking crash cause #1: page allocation failure (order:1, GFP_ATOMIC)

On Tuesday 06 November 2007 04:42, Frank van Maarseveen wrote:
> For quite some time I'm seeing occasional lockups spread over 50 different
> machines I'm maintaining. Symptom: a page allocation failure with order:1,
> GFP_ATOMIC, while there is plenty of memory, as it seems (lots of free
> pages, almost no swap used) followed by a lockup (everything dead). I've
> collected all (12) crash cases which occurred the last 10 weeks on 50
> machines total (i.e. 1 crash every 41 weeks on average). The kernel
> messages are summarized to show the interesting part (IMO) they have
> in common. Over the years this has become the crash cause #1 for stable
> kernels for me (fglrx doesn't count ;).
>
> One note: I suspect that reporting a GFP_ATOMIC allocation failure in a
> network driver via that same driver (netconsole) may not be the smartest
> thing to do and this could be responsible for the lockup itself. However,
> the initial page allocation failure remains and I'm not sure how to
> address that problem.

It isn't unexpected. If an atomic allocation doesn't have enough memory,
it kicks off kswapd to start freeing memory for it. However, it cannot
wait for memory to become free (it's GFP_ATOMIC), so it has to return
failure. GFP_ATOMIC allocation paths are designed so that the kernel can
recover from this situation, and a subsequent allocation will have free
memory.
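
In code terms, the slowpath behaves roughly like the sketch below (the
helper names are invented; this is not the real mm/page_alloc.c):

/* Rough sketch of the behaviour described above; take_from_freelists(),
 * wake_up_kswapd_sketch() and direct_reclaim_and_retry() are placeholders. */
struct page *alloc_pages_sketch(gfp_t gfp_mask, unsigned int order)
{
    struct page *page = take_from_freelists(order);     /* fast path */

    if (page)
        return page;

    wake_up_kswapd_sketch(order);        /* background reclaim gets started */

    page = take_from_freelists(order);   /* one more try, digging a little
                                          * deeper into the reserves */
    if (page)
        return page;

    if (!(gfp_mask & __GFP_WAIT))        /* GFP_ATOMIC cannot sleep, */
        return NULL;                     /* so the failure is reported now */

    return direct_reclaim_and_retry(gfp_mask, order);   /* sleepable callers only */
}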

Probably in production kernels we should default to only reporting this
when page reclaim is not making any progress.


> I still think the issue is memory fragmentation but if so, it looks
> a bit extreme to me: One system with 2GB of ram crashed after a day,
> merely running a couple of TCP server programs. All systems have either
> 1 or 2GB ram and at least 1G of (merely unused) swap.

You can reduce the chances of it happening by increasing
/proc/sys/vm/min_free_kbytes.

2007-11-07 13:48:54

by Frank van Maarseveen

Subject: Re: VM/networking crash cause #1: page allocation failure (order:1, GFP_ATOMIC)

On Wed, Nov 07, 2007 at 09:01:17AM +1100, Nick Piggin wrote:
> On Tuesday 06 November 2007 04:42, Frank van Maarseveen wrote:
> > For quite some time I'm seeing occasional lockups spread over 50 different
> > machines I'm maintaining. Symptom: a page allocation failure with order:1,
> > GFP_ATOMIC, while there is plenty of memory, as it seems (lots of free
> > pages, almost no swap used) followed by a lockup (everything dead). I've
> > collected all (12) crash cases which occurred the last 10 weeks on 50
> > machines total (i.e. 1 crash every 41 weeks on average). The kernel
> > messages are summarized to show the interesting part (IMO) they have
> > in common. Over the years this has become the crash cause #1 for stable
> > kernels for me (fglrx doesn't count ;).
> >
> > One note: I suspect that reporting a GFP_ATOMIC allocation failure in a
> > network driver via that same driver (netconsole) may not be the smartest
> > thing to do and this could be responsible for the lockup itself. However,
> > the initial page allocation failure remains and I'm not sure how to
> > address that problem.
>
> It isn't unexpected. If an atomic allocation doesn't have enough memory,
> it kicks off kswapd to start freeing memory for it. However, it cannot
> wait for memory to become free (it's GFP_ATOMIC), so it has to return
> failure. GFP_ATOMIC allocation paths are designed so that the kernel can
> recover from this situation, and a subsequent allocation will have free
> memory.
>
> Probably in production kernels we should default to only reporting this
> when page reclaim is not making any progress.
>
>
> > I still think the issue is memory fragmentation but if so, it looks
> > a bit extreme to me: One system with 2GB of ram crashed after a day,
> > merely running a couple of TCP server programs. All systems have either
> > 1 or 2GB ram and at least 1G of (merely unused) swap.
>
> You can reduce the chances of it happening by increasing
> /proc/sys/vm/min_free_kbytes.

It's 3807 everywhere by default here, which means roughly 950 pages if I
understand correctly. However, the problem seems to occur with many more
pages free. "grep ' free:' messages*" on the netconsole logging machine
shows:

messages:Nov 5 12:58:27 lokka free:89531 slab:122946 mapped:4421 pagetables:295 bounce:0
messages.0:Oct 29 11:48:07 somero free:113487 slab:109249 mapped:177942 pagetables:645 bounce:0
messages.1:Oct 26 11:27:01 naantali free:41073 slab:97335 mapped:2540 pagetables:149 bounce:0
messages.3:Oct 12 14:56:44 koli free:25694 slab:111842 mapped:16299 pagetables:794 bounce:0
messages.4:Oct 1 08:51:58 salla free:2964 slab:97675 mapped:279622 pagetables:650 bounce:0
messages.6:Sep 17 10:34:49 lokka free:170034 slab:95601 mapped:5548 pagetables:304 bounce:0
messages.6:Sep 17 10:48:48 karvio free:33597 slab:94909 mapped:247838 pagetables:695 bounce:0
messages.6:Sep 20 10:32:50 nivala free:80943 slab:93154 mapped:200040 pagetables:698 bounce:0
messages.8:Sep 3 09:46:11 lahti free:8195 slab:125438 mapped:2911 pagetables:192 bounce:0
messages.9:Aug 30 10:40:46 ropi free:61633 slab:90119 mapped:272634 pagetables:487 bounce:0
messages.9:Aug 30 10:46:58 ivalo free:41600 slab:88279 mapped:272705 pagetables:487 bounce:0
messages.9:Aug 31 16:30:02 lokka free:40661 slab:115006 mapped:21208 pagetables:493 bounce:0

So it is happening even with 170034 free pages (Sep 17 10:34:49), i.e.
roughly 660MB free. In this particular case the machine didn't crash, but
in the majority of cases it does. Here's the full log of the 170034 free
pages case:

Sep 17 10:34:49 lokka kernel: ftxpd: page allocation failure. order:1, mode:0x4020
Sep 17 10:34:49 lokka kernel: [<c01054aa>] show_trace_log_lvl+0x1a/0x30
Sep 17 10:34:49 lokka kernel: [<c01054d2>] show_trace+0x12/0x20
Sep 17 10:34:49 lokka kernel: [<c01055f6>] dump_stack+0x16/0x20
Sep 17 10:34:49 lokka kernel: [<c0156ade>] __alloc_pages+0x27e/0x300
Sep 17 10:34:49 lokka kernel: [<c016e926>] allocate_slab+0x46/0x90
Sep 17 10:34:49 lokka kernel: [<c016e9e1>] new_slab+0x31/0x140
Sep 17 10:34:49 lokka kernel: [<c016efec>] __slab_alloc+0xbc/0x180
Sep 17 10:34:49 lokka kernel: [<c0170364>] __kmalloc_track_caller+0x74/0x80
Sep 17 10:34:49 lokka kernel: [<c04b1d8d>] __alloc_skb+0x4d/0x110
Sep 17 10:34:49 lokka kernel: [<c0500cce>] tcp_collapse+0x17e/0x3b0
Sep 17 10:34:49 lokka kernel: [<c050103f>] tcp_prune_queue+0x7f/0x1c0
Sep 17 10:34:49 lokka kernel: [<c05008b7>] tcp_data_queue+0x487/0x720
Sep 17 10:34:49 lokka kernel: [<c0501a20>] tcp_rcv_established+0x3a0/0x6e0
Sep 17 10:34:49 lokka kernel: [<c0509859>] tcp_v4_do_rcv+0xe9/0x100
Sep 17 10:34:49 lokka kernel: [<c050a061>] tcp_v4_rcv+0x7f1/0x8d0
Sep 17 10:34:49 lokka kernel: [<c04ed2bf>] ip_local_deliver+0xef/0x250
Sep 17 10:34:49 lokka kernel: [<c04ed874>] ip_rcv+0x264/0x560
Sep 17 10:34:49 lokka kernel: [<c04b905d>] netif_receive_skb+0x2ad/0x320
Sep 17 10:34:49 lokka kernel: [<c04b9161>] process_backlog+0x91/0x120
Sep 17 10:34:49 lokka kernel: [<c04b927d>] net_rx_action+0x8d/0x170
Sep 17 10:34:49 lokka kernel: [<c0128de8>] __do_softirq+0x78/0x100
Sep 17 10:34:49 lokka kernel: [<c0128eac>] do_softirq+0x3c/0x40
Sep 17 10:34:49 lokka kernel: [<c0128f15>] irq_exit+0x45/0x50
Sep 17 10:34:49 lokka kernel: [<c0106d2f>] do_IRQ+0x4f/0xa0
Sep 17 10:34:49 lokka kernel: [<c01050b3>] common_interrupt+0x23/0x30
Sep 17 10:34:49 lokka kernel: [<c01c87e3>] journal_get_write_access+0x23/0x40
Sep 17 10:34:49 lokka kernel: [<c01c749f>] __ext3_journal_get_write_access+0x1f/0x60
Sep 17 10:34:49 lokka kernel: [<c01be5ee>] ext3_reserve_inode_write+0x5e/0x80
Sep 17 10:34:49 lokka kernel: [<c01be645>] ext3_mark_inode_dirty+0x35/0x60
Sep 17 10:34:49 lokka kernel: [<c01bb421>] ext3_splice_branch+0x81/0x1f0
Sep 17 10:34:49 lokka kernel: [<c01bb76c>] ext3_get_blocks_handle+0x1dc/0x360
Sep 17 10:34:49 lokka kernel: [<c01bb966>] ext3_get_block+0x76/0xf0
Sep 17 10:34:49 lokka kernel: [<c01960ec>] __block_prepare_write+0x20c/0x430
Sep 17 10:34:49 lokka kernel: [<c0196bb8>] block_prepare_write+0x28/0x40
Sep 17 10:34:49 lokka kernel: [<c01bbe82>] ext3_prepare_write+0xe2/0x170
Sep 17 10:34:49 lokka kernel: [<c0153e2f>] generic_file_buffered_write+0x1cf/0x620
Sep 17 10:34:49 lokka kernel: [<c01544c7>] __generic_file_aio_write_nolock+0x247/0x560
Sep 17 10:34:49 lokka kernel: [<c015491a>] generic_file_aio_write+0x5a/0xd0
Sep 17 10:34:49 lokka kernel: [<c01b98ad>] ext3_file_write+0x2d/0xd0
Sep 17 10:34:49 lokka kernel: [<c0173b57>] do_sync_write+0xc7/0x120
Sep 17 10:34:49 lokka kernel: [<c0173c37>] vfs_write+0x87/0x130
Sep 17 10:34:49 lokka kernel: [<c0173d8d>] sys_write+0x3d/0x70
Sep 17 10:34:49 lokka kernel: [<c0104132>] syscall_call+0x7/0xb
Sep 17 10:34:49 lokka kernel: =======================
Sep 17 10:34:49 lokka kernel: Mem-info:
Sep 17 10:34:49 lokka kernel: DMA per-cpu:
Sep 17 10:34:49 lokka kernel: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Sep 17 10:34:49 lokka kernel: CPU 1: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Sep 17 10:34:49 lokka kernel: Normal per-cpu:
Sep 17 10:34:49 lokka kernel: CPU 0: Hot: hi: 186, btch: 31 usd: 177 Cold: hi: 62, btch: 15 usd: 54
Sep 17 10:34:49 lokka kernel: CPU 1: Hot: hi: 186, btch: 31 usd: 46 Cold: hi: 62, btch: 15 usd: 58
Sep 17 10:34:49 lokka kernel: HighMem per-cpu:
Sep 17 10:34:49 lokka kernel: CPU 0: Hot: hi: 186, btch: 31 usd: 2 Cold: hi: 62, btch: 15 usd: 11
Sep 17 10:34:49 lokka kernel: CPU 1: Hot: hi: 186, btch: 31 usd: 19 Cold: hi: 62, btch: 15 usd: 1
Sep 17 10:34:49 lokka kernel: Active:85625 inactive:152024 dirty:5701 writeback:2516 unstable:0
Sep 17 10:34:49 lokka kernel: free:170034 slab:95601 mapped:5548 pagetables:304 bounce:0
Sep 17 10:34:49 lokka kernel: DMA free:3496kB min:64kB low:80kB high:96kB active:0kB inactive:0kB present:16176kB pages_scanned:0 all_unreclaimable? no
Sep 17 10:34:49 lokka kernel: lowmem_reserve[]: 0 868 2004
Sep 17 10:34:49 lokka kernel: Normal free:39756kB min:3736kB low:4668kB high:5604kB active:236916kB inactive:175624kB present:889680kB pages_scanned:0 all_unreclaimable? no
Sep 17 10:34:49 lokka kernel: lowmem_reserve[]: 0 0 9087
Sep 17 10:34:49 lokka kernel: HighMem free:636884kB min:512kB low:1732kB high:2952kB active:105584kB inactive:432472kB present:1163188kB pages_scanned:0 all_unreclaimable? no
Sep 17 10:34:49 lokka kernel: lowmem_reserve[]: 0 0 0
Sep 17 10:34:49 lokka kernel: DMA: 58*4kB 1*8kB 0*16kB 0*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3504kB
Sep 17 10:34:49 lokka kernel: Normal: 9770*4kB 6*8kB 0*16kB 0*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 39832kB
Sep 17 10:34:49 lokka kernel: HighMem: 0*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 717*256kB 407*512kB 139*1024kB 36*2048kB 7*4096kB = 636784kB
Sep 17 10:34:49 lokka kernel: Swap cache: add 13, delete 12, find 0/0, race 0+0
Sep 17 10:34:49 lokka kernel: Free swap = 1004000kB
Sep 17 10:34:49 lokka kernel: Total swap = 1004052kB
Sep 17 10:34:49 lokka kernel: Free swap: 1004000kB
Sep 17 10:34:49 lokka kernel: 523912 pages of RAM
Sep 17 10:34:49 lokka kernel: 294536 pages of HIGHMEM
Sep 17 10:34:49 lokka kernel: 9361 reserved pages
Sep 17 10:34:49 lokka kernel: 232017 pages shared
Sep 17 10:34:49 lokka kernel: 1 pages swap cached
Sep 17 10:34:49 lokka kernel: 5701 pages dirty
Sep 17 10:34:49 lokka kernel: 2392 pages writeback
Sep 17 10:34:49 lokka kernel: 5548 pages mapped
Sep 17 10:34:49 lokka kernel: 95392 pages slab
Sep 17 10:34:49 lokka kernel: 304 pages pagetables

--
Frank

2007-11-07 13:56:56

by Frank van Maarseveen

Subject: Re: VM/networking crash cause #1: page allocation failure (order:1, GFP_ATOMIC)

On Tue, Nov 06, 2007 at 05:13:50PM -0600, Robert Hancock wrote:
> Frank van Maarseveen wrote:
> >For quite some time I'm seeing occasional lockups spread over 50 different
> >machines I'm maintaining. Symptom: a page allocation failure with order:1,
> >GFP_ATOMIC, while there is plenty of memory, as it seems (lots of free
> >pages, almost no swap used) followed by a lockup (everything dead). I've
> >collected all (12) crash cases which occurred the last 10 weeks on 50
> >machines total (i.e. 1 crash every 41 weeks on average). The kernel
> >messages are summarized to show the interesting part (IMO) they have
> >in common. Over the years this has become the crash cause #1 for stable
> >kernels for me (fglrx doesn't count ;).
> >
> >One note: I suspect that reporting a GFP_ATOMIC allocation failure in a
> >network driver via that same driver (netconsole) may not be the smartest
> >thing to do and this could be responsible for the lockup itself. However,
> >the initial page allocation failure remains and I'm not sure how to
> >address that problem.
> >
> >I still think the issue is memory fragmentation but if so, it looks
> >a bit extreme to me: One system with 2GB of ram crashed after a day,
> >merely running a couple of TCP server programs. All systems have either
> >1 or 2GB ram and at least 1G of (merely unused) swap.
>
> These are all order-1 allocations for received network packets that need
> to be allocated out of low memory (assuming you're using a 32-bit
> kernel), so it's quite possible for them to fail on occasion. (Are you
> using jumbo frames?)

I don't use jumbo frames.


>
> That should not be causing a lockup though.. the received packet should
> just get dropped.

Ok, packet loss is recoverable to some extent. When a system crashes
I often see a couple of page allocation failures in the same second,
all reported via netconsole. Here's the full log of such a case:

Oct 26 11:27:01 naantali kswapd0: page allocation failure. order:1, mode:0x4020
Oct 26 11:27:01 naantali [<c01054aa>] show_trace_log_lvl+0x1a/0x30
Oct 26 11:27:01 naantali [<c01054d2>] show_trace+0x12/0x20
Oct 26 11:27:01 naantali [<c01055f6>] dump_stack+0x16/0x20
Oct 26 11:27:01 naantali [<c0156ace>] __alloc_pages+0x27e/0x300
Oct 26 11:27:01 naantali [<c016e916>] allocate_slab+0x46/0x90
Oct 26 11:27:01 naantali [<c016e9d1>] new_slab+0x31/0x140
Oct 26 11:27:01 naantali [<c016efdc>] __slab_alloc+0xbc/0x180
Oct 26 11:27:01 naantali [<c0170354>] __kmalloc_track_caller+0x74/0x80
Oct 26 11:27:01 naantali [<c04b193d>] __alloc_skb+0x4d/0x110
Oct 26 11:27:01 naantali [<c050088e>] tcp_collapse+0x17e/0x3b0
Oct 26 11:27:01 naantali [<c0500bff>] tcp_prune_queue+0x7f/0x1c0
Oct 26 11:27:01 naantali [<c0500477>] tcp_data_queue+0x487/0x720
Oct 26 11:27:01 naantali [<c05015e0>] tcp_rcv_established+0x3a0/0x6e0
Oct 26 11:27:01 naantali [<c0509419>] tcp_v4_do_rcv+0xe9/0x100
Oct 26 11:27:01 naantali [<c0509c21>] tcp_v4_rcv+0x7f1/0x8d0
Oct 26 11:27:01 naantali [<c04ece8f>] ip_local_deliver+0xef/0x250
Oct 26 11:27:01 naantali [<c04ed444>] ip_rcv+0x264/0x560
Oct 26 11:27:01 naantali [<c04b8c2d>] netif_receive_skb+0x2ad/0x320
Oct 26 11:27:01 naantali [<c04b8d31>] process_backlog+0x91/0x120
Oct 26 11:27:01 naantali [<c04b8e4d>] net_rx_action+0x8d/0x170
Oct 26 11:27:01 naantali [<c0128de8>] __do_softirq+0x78/0x100
Oct 26 11:27:01 naantali [<c0128eac>] do_softirq+0x3c/0x40
Oct 26 11:27:01 naantali [<c0128f15>] irq_exit+0x45/0x50
Oct 26 11:27:01 naantali [<c0106d2f>] do_IRQ+0x4f/0xa0
Oct 26 11:27:01 naantali [<c01050b3>] common_interrupt+0x23/0x30
Oct 26 11:27:01 naantali [<c0575736>] _spin_unlock+0x16/0x20
Oct 26 11:27:01 naantali [<c0185052>] prune_dcache+0x142/0x1a0
Oct 26 11:27:01 naantali [<c01855ee>] shrink_dcache_memory+0x1e/0x50
Oct 26 11:27:01 naantali [<c015a4d9>] shrink_slab+0x139/0x1d0
Oct 26 11:27:01 naantali [<c015b980>] balance_pgdat+0x220/0x380
Oct 26 11:27:01 naantali [<c015bbb8>] kswapd+0xd8/0x140
Oct 26 11:27:01 naantali [<c0136eac>] kthread+0x5c/0xa0
Oct 26 11:27:01 naantali [<c0105337>] kernel_thread_helper+0x7/0x10
Oct 26 11:27:01 naantali =======================
Oct 26 11:27:01 naantali Mem-info:
Oct 26 11:27:01 naantali DMA per-cpu:
Oct 26 11:27:01 naantali CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Oct 26 11:27:01 naantali Normal per-cpu:
Oct 26 11:27:01 naantali CPU 0: Hot: hi: 186, btch: 31 usd: 149 Cold: hi: 62, btch: 15 usd: 59
Oct 26 11:27:01 naantali HighMem per-cpu:
Oct 26 11:27:01 naantali CPU 0: Hot: hi: 186, btch: 31 usd: 21 Cold: hi: 62, btch: 15 usd: 12
Oct 26 11:27:01 naantali Active:115770 inactive:257188 dirty:10243 writeback:1920 unstable:0
Oct 26 11:27:01 naantali free:41073 slab:97335 mapped:2540 pagetables:149 bounce:0
Oct 26 11:27:01 naantali DMA free:3488kB min:64kB low:80kB high:96kB active:0kB inactive:0kB present:16176kB pages_scanned:0 all_unreclaimable? no
Oct 26 11:27:01 naantali lowmem_reserve[]: 0 868 2005
Oct 26 11:27:01 naantali Normal free:3976kB min:3736kB low:4668kB high:5604kB active:318568kB inactive:152928kB present:889680kB pages_scanned:0 all_unreclaimable? no
Oct 26 11:27:01 naantali lowmem_reserve[]: 0 0 9094
Oct 26 11:27:01 naantali HighMem free:156828kB min:512kB low:1732kB high:2956kB active:144512kB inactive:875824kB present:1164104kB pages_scanned:0 all_unreclaimable? no
Oct 26 11:27:01 naantali lowmem_reserve[]: 0 0 0
Oct 26 11:27:01 naantali DMA: 12*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3488kB
Oct 26 11:27:01 naantali Normal: 824*4kB 5*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 4008kB
Oct 26 11:27:01 naantali HighMem: 1*4kB 0*8kB 1*16kB 0*32kB 0*64kB 1*128kB 0*256kB 0*512kB 113*1024kB 20*2048kB 0*4096kB = 156820kB
Oct 26 11:27:01 naantali Swap cache: add 90, delete 88, find 18/28, race 0+0
Oct 26 11:27:01 naantali Free swap = 2008048kB
Oct 26 11:27:01 naantali Total swap = 2008084kB
Oct 26 11:27:01 naantali Free swap: 2008048kB
Oct 26 11:27:01 naantali 524144 pages of RAM
Oct 26 11:27:01 naantali 294768 pages of HIGHMEM
Oct 26 11:27:01 naantali 9378 reserved pages
Oct 26 11:27:01 naantali 357325 pages shared
Oct 26 11:27:01 naantali 2 pages swap cached
Oct 26 11:27:01 naantali 10243 pages dirty
Oct 26 11:27:01 naantali 1920 pages writeback
Oct 26 11:27:01 naantali 2540 pages mapped
Oct 26 11:27:01 naantali 97344 pages slab
Oct 26 11:27:01 naantali 149 pages pagetables
Oct 26 11:27:01 naantali kswapd0: page allocation failure. order:1, mode:0x4020
Oct 26 11:27:01 naantali [<c01054aa>] show_trace_log_lvl+0x1a/0x30
Oct 26 11:27:01 naantali [<c01054d2>] show_trace+0x12/0x20
Oct 26 11:27:01 naantali [<c01055f6>] dump_stack+0x16/0x20
Oct 26 11:27:01 naantali [<c0156ace>] __alloc_pages+0x27e/0x300
Oct 26 11:27:01 naantali [<c016e916>] allocate_slab+0x46/0x90
Oct 26 11:27:01 naantali [<c016e9d1>] new_slab+0x31/0x140
Oct 26 11:27:01 naantali [<c016efdc>] __slab_alloc+0xbc/0x180
Oct 26 11:27:01 naantali [<c0170354>] __kmalloc_track_caller+0x74/0x80
Oct 26 11:27:01 naantali [<c04b193d>] __alloc_skb+0x4d/0x110
Oct 26 11:27:01 naantali [<c04b1a23>] __netdev_alloc_skb+0x23/0x50
Oct 26 11:27:01 naantali [<c0328f37>] e1000_alloc_rx_buffers+0x127/0x320
Oct 26 11:27:01 naantali [<c03285b9>] e1000_clean_rx_irq+0x299/0x510
Oct 26 11:27:01 naantali [<c0327f30>] e1000_intr+0x80/0x140
Oct 26 11:27:01 naantali [<c014e3a8>] handle_IRQ_event+0x28/0x60
Oct 26 11:27:01 naantali [<c014f67e>] handle_fasteoi_irq+0x6e/0xd0
Oct 26 11:27:01 naantali [<c0106d2a>] do_IRQ+0x4a/0xa0
Oct 26 11:27:01 naantali [<c01050b3>] common_interrupt+0x23/0x30
Oct 26 11:27:01 naantali [<c057573d>] _spin_unlock+0x1d/0x20
Oct 26 11:27:01 naantali [<c0187154>] dispose_list+0x64/0xe0
Oct 26 11:27:01 naantali [<c01873f4>] prune_icache+0xd4/0x1e0
Oct 26 11:27:01 naantali [<c0187517>] shrink_icache_memory+0x17/0x40
Oct 26 11:27:01 naantali [<c015a4d9>] shrink_slab+0x139/0x1d0
Oct 26 11:27:01 naantali [<c015b980>] balance_pgdat+0x220/0x380
Oct 26 11:27:01 naantali [<c015bbb8>] kswapd+0xd8/0x140
Oct 26 11:27:01 naantali [<c0136eac>] kthread+0x5c/0xa0
Oct 26 11:27:01 naantali [<c0105337>] kernel_thread_helper+0x7/0x10
Oct 26 11:27:01 naantali =======================
Oct 26 11:27:01 naantali Mem-info:
Oct 26 11:27:01 naantali DMA per-cpu:
Oct 26 11:27:01 naantali CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Oct 26 11:27:01 naantali Normal per-cpu:
Oct 26 11:27:01 naantali CPU 0: Hot: hi: 186, btch: 31 usd: 155 Cold: hi: 62, btch: 15 usd: 50
Oct 26 11:27:01 naantali HighMem per-cpu:
Oct 26 11:27:01 naantali CPU 0: Hot: hi: 186, btch: 31 usd: 21 Cold: hi: 62, btch: 15 usd: 12
Oct 26 11:27:01 naantali Active:115692 inactive:257170 dirty:10243 writeback:1888 unstable:0
Oct 26 11:27:01 naantali free:41181 slab:97335 mapped:2540 pagetables:149 bounce:0
Oct 26 11:27:01 naantali DMA free:3488kB min:64kB low:80kB high:96kB active:0kB inactive:0kB present:16176kB pages_scanned:0 all_unreclaimable? yes
Oct 26 11:27:01 naantali lowmem_reserve[]: 0 868 2005
Oct 26 11:27:01 naantali Normal free:4408kB min:3736kB low:4668kB high:5604kB active:318256kB inactive:152856kB present:889680kB pages_scanned:0 all_unreclaimable? no
Oct 26 11:27:01 naantali lowmem_reserve[]: 0 0 9094
Oct 26 11:27:01 naantali HighMem free:156828kB min:512kB low:1732kB high:2956kB active:144512kB inactive:875824kB present:1164104kB pages_scanned:0 all_unreclaimable? no
Oct 26 11:27:01 naantali lowmem_reserve[]: 0 0 0
Oct 26 11:27:01 naantali DMA: 12*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3488kB
Oct 26 11:27:01 naantali Normal: 927*4kB 6*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 4428kB
Oct 26 11:27:01 naantali HighMem: 1*4kB 0*8kB 1*16kB 0*32kB 0*64kB 1*128kB 0*256kB 0*512kB 113*1024kB 20*2048kB 0*4096kB = 156820kB
Oct 26 11:27:01 naantali Swap cache: add 90, delete 88, find 18/28, race 0+0
Oct 26 11:27:01 naantali Free swap = 2008048kB
Oct 26 11:27:01 naantali Total swap = 2008084kB
Oct 26 11:27:01 naantali Free swap: 2008048kB
Oct 26 11:27:01 naantali 524144 pages of RAM
Oct 26 11:27:01 naantali 294768 pages of HIGHMEM
Oct 26 11:27:01 naantali 9378 reserved pages
Oct 26 11:27:01 naantali 357229 pages shared
Oct 26 11:27:01 naantali 2 pages swap cached
Oct 26 11:27:01 naantali 10243 pages dirty
Oct 26 11:27:01 naantali 1888 pages writeback
Oct 26 11:27:01 naantali 2540 pages mapped
Oct 26 11:27:01 naantali 97344 pages slab
Oct 26 11:27:01 naantali 149 pages pagetables
Oct 26 11:27:01 naantali kswapd0: page allocation failure. order:1, mode:0x4020
Oct 26 11:27:01 naantali [<c01054aa>] show_trace_log_lvl+0x1a/0x30
Oct 26 11:27:01 naantali [<c01054d2>] show_trace+0x12/0x20
Oct 26 11:27:01 naantali [<c01055f6>] dump_stack+0x16/0x20
Oct 26 11:27:01 naantali [<c0156ace>] __alloc_pages+0x27e/0x300
Oct 26 11:27:01 naantali [<c016e916>] allocate_slab+0x46/0x90
Oct 26 11:27:01 naantali [<c016e9d1>] new_slab+0x31/0x140
Oct 26 11:27:01 naantali [<c016efdc>] __slab_alloc+0xbc/0x180
Oct 26 11:27:01 naantali [<c0170354>] __kmalloc_track_caller+0x74/0x80
Oct 26 11:27:01 naantali [<c04b193d>] __alloc_skb+0x4d/0x110
Oct 26 11:27:01 naantali [<c04b1a23>] __netdev_alloc_skb+0x23/0x50
Oct 26 11:27:01 naantali [<c0328f37>] e1000_alloc_rx_buffers+0x127/0x320
Oct 26 11:27:01 naantali [<c0328698>] e1000_clean_rx_irq+0x378/0x510
Oct 26 11:27:01 naantali [<c0327f30>] e1000_intr+0x80/0x140
Oct 26 11:27:01 naantali [<c014e3a8>] handle_IRQ_event+0x28/0x60
Oct 26 11:27:01 naantali [<c014f67e>] handle_fasteoi_irq+0x6e/0xd0
Oct 26 11:27:01 naantali [<c0106d2a>] do_IRQ+0x4a/0xa0
Oct 26 11:27:01 naantali [<c01050b3>] common_interrupt+0x23/0x30
Oct 26 11:27:01 naantali [<c057573d>] _spin_unlock+0x1d/0x20
Oct 26 11:27:01 naantali [<c0187154>] dispose_list+0x64/0xe0
Oct 26 11:27:01 naantali [<c01873f4>] prune_icache+0xd4/0x1e0
Oct 26 11:27:01 naantali [<c0187517>] shrink_icache_memory+0x17/0x40
Oct 26 11:27:01 naantali [<c015a4d9>] shrink_slab+0x139/0x1d0
Oct 26 11:27:01 naantali [<c015b980>] balance_pgdat+0x220/0x380
Oct 26 11:27:01 naantali [<c015bbb8>] kswapd+0xd8/0x140
Oct 26 11:27:01 naantali [<c0136eac>] kthread+0x5c/0xa0
Oct 26 11:27:01 naantali [<c0105337>] kernel_thread_helper+0x7/0x10
Oct 26 11:27:01 naantali =======================
Oct 26 11:27:01 naantali Mem-info:
Oct 26 11:27:01 naantali DMA per-cpu:
Oct 26 11:27:01 naantali CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Oct 26 11:27:01 naantali Normal per-cpu:
Oct 26 11:27:01 naantali CPU 0: Hot: hi: 186, btch: 31 usd: 155 Cold: hi: 62, btch: 15 usd: 50
Oct 26 11:27:01 naantali HighMem per-cpu:
Oct 26 11:27:01 naantali CPU 0: Hot: hi: 186, btch: 31 usd: 21 Cold: hi: 62, btch: 15 usd: 12
Oct 26 11:27:01 naantali Active:115692 inactive:257170 dirty:10243 writeback:1888 unstable:0
Oct 26 11:27:01 naantali free:41181 slab:97335 mapped:2540 pagetables:149 bounce:0
Oct 26 11:27:01 naantali DMA free:3488kB min:64kB low:80kB high:96kB active:0kB inactive:0kB present:16176kB pages_scanned:0 all_unreclaimable? yes
Oct 26 11:27:01 naantali lowmem_reserve[]: 0 868 2005
Oct 26 11:27:01 naantali Normal free:4408kB min:3736kB low:4668kB high:5604kB active:318256kB inactive:152856kB present:889680kB pages_scanned:0 all_unreclaimable? no
Oct 26 11:27:01 naantali lowmem_reserve[]: 0 0 9094
Oct 26 11:27:01 naantali HighMem free:156828kB min:512kB low:1732kB high:2956kB active:144512kB inactive:875824kB present:1164104kB pages_scanned:0 all_unreclaimable? no
Oct 26 11:27:01 naantali lowmem_reserve[]: 0 0 0
Oct 26 11:27:01 naantali DMA: 12*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3488kB
Oct 26 11:27:01 naantali Normal: 927*4kB 6*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB
Oct 26 11:27:01 naantali 524144 pages of RAM
Oct 26 11:27:01 naantali 294768 pages of HIGHMEM
Oct 26 11:27:01 naantali 9378 reserved pages
Oct 26 11:27:01 naantali 357229 pages shared
Oct 26 11:27:01 naantali 2 pages swap cached
Oct 26 11:27:01 naantali 10243 pages dirty
Oct 26 11:27:01 naantali 1888 pages writeback
Oct 26 11:27:01 naantali 2540 pages mapped
Oct 26 11:27:01 naantali 97344 pages slab
Oct 26 11:27:01 naantali 149 pages pagetables
Oct 26 11:27:01 naantali __alloc_pages+0x27e/0x300
Oct 26 11:27:01 naantali kthread+0x5c/0xa0
Oct 26 11:27:01 naantali 1*2048kB
<dead>

--
Frank

2007-11-07 15:22:24

by Frank van Maarseveen

Subject: Re: VM/networking crash cause #1: page allocation failure (order:1, GFP_ATOMIC)

On Wed, Nov 07, 2007 at 02:56:45PM +0100, Frank van Maarseveen wrote:
> On Tue, Nov 06, 2007 at 05:13:50PM -0600, Robert Hancock wrote:
> > Frank van Maarseveen wrote:
> > >For quite some time I'm seeing occasional lockups spread over 50 different
> > >machines I'm maintaining. Symptom: a page allocation failure with order:1,
> > >GFP_ATOMIC, while there is plenty of memory, as it seems (lots of free
> > >pages, almost no swap used) followed by a lockup (everything dead). I've
> > >collected all (12) crash cases which occurred the last 10 weeks on 50
> > >machines total (i.e. 1 crash every 41 weeks on average). The kernel
> > >messages are summarized to show the interesting part (IMO) they have
> > >in common. Over the years this has become the crash cause #1 for stable
> > >kernels for me (fglrx doesn't count ;).
> > >
> > >One note: I suspect that reporting a GFP_ATOMIC allocation failure in a
> > >network driver via that same driver (netconsole) may not be the smartest
> > >thing to do and this could be responsible for the lockup itself. However,
> > >the initial page allocation failure remains and I'm not sure how to
> > >address that problem.
> > >
> > >I still think the issue is memory fragmentation but if so, it looks
> > >a bit extreme to me: One system with 2GB of ram crashed after a day,
> > >merely running a couple of TCP server programs. All systems have either
> > >1 or 2GB ram and at least 1G of (merely unused) swap.
> >
> > These are all order-1 allocations for received network packets that need
> > to be allocated out of low memory (assuming you're using a 32-bit
> > kernel), so it's quite possible for them to fail on occasion. (Are you
> > using jumbo frames?)
>
> I don't use jumbo frames.
>
>
> >
> > That should not be causing a lockup though.. the received packet should
> > just get dropped.
>
> Ok, packet loss is recoverable to some extent. When a system crashes
> I often see a couple of page allocation failures in the same second,
> all reported via netconsole.

[snip]

I've grepped for 'Normal free:', assuming that is the low memory you
mention, to see how it correlates. Of the 12 cases, 7 crashed and 5
recovered:

Nov 5 12:58:27 lokka Normal free:6444kB min:3736kB low:4668kB high:5604kB active:235196kB inactive:104336kB present:889680kB pages_scanned:44 all_unreclaimable? no
Nov 5 12:58:27 lokka Normal free:6444kB min:3736kB low:4668kB high:5604kB active:235196kB inactive:104336kB present:889680kB pages_scanned:44 all_unreclaimable? no
Nov 5 12:58:27 lokka Normal free:6444kB min:3736kB low:4668kB high:5604kB active:235196kB inactive:104336kB present:889680kB pages_scanned:44 all_unreclaimable? no
crash

Oct 29 11:48:07 somero Normal free:5412kB min:3736kB low:4668kB high:5604kB active:288068kB inactive:105708kB present:889680kB pages_scanned:0 all_unreclaimable? no
Oct 29 11:48:07 somero Normal free:6704kB min:3736kB low:4668kB high:5604kB active:287940kB inactive:105084kB present:889680kB pages_scanned:0 all_unreclaimable? no
Oct 29 11:48:08 somero Normal free:8332kB min:3736kB low:4668kB high:5604kB active:287760kB inactive:104240kB present:889680kB pages_scanned:54 all_unreclaimable? no
ok (more cases with increasing free memory not received via netconsole)

Oct 26 11:27:01 naantali Normal free:3976kB min:3736kB low:4668kB high:5604kB active:318568kB inactive:152928kB present:889680kB pages_scanned:0 all_unreclaimable? no
Oct 26 11:27:01 naantali Normal free:4408kB min:3736kB low:4668kB high:5604kB active:318256kB inactive:152856kB present:889680kB pages_scanned:0 all_unreclaimable? no
Oct 26 11:27:01 naantali Normal free:4408kB min:3736kB low:4668kB high:5604kB active:318256kB inactive:152856kB present:889680kB pages_scanned:0 all_unreclaimable? no
crash

Oct 12 14:56:44 koli Normal free:11628kB min:3736kB low:4668kB high:5604kB active:238112kB inactive:157232kB present:889680kB pages_scanned:0 all_unreclaimable? no
ok

Oct 1 08:51:58 salla Normal free:5496kB min:3736kB low:4668kB high:5604kB active:409500kB inactive:46388kB present:889680kB pages_scanned:137 all_unreclaimable? no
Oct 1 08:51:59 salla Normal free:7396kB min:3736kB low:4668kB high:5604kB active:408292kB inactive:46740kB present:889680kB pages_scanned:0 all_unreclaimable? no
crash

Sep 17 10:34:49 lokka Normal free:39756kB min:3736kB low:4668kB high:5604kB active:236916kB inactive:175624kB present:889680kB pages_scanned:0 all_unreclaimable? no
ok

Sep 17 10:48:48 karvio Normal free:11648kB min:3736kB low:4668kB high:5604kB active:424420kB inactive:45380kB present:889680kB pages_scanned:144 all_unreclaimable? no
Sep 17 10:48:48 karvio Normal free:11648kB min:3736kB low:4668kB high:5604kB active:424420kB inactive:45380kB present:889680kB pages_scanned:144 all_unreclaimable? no
crash

Sep 20 10:32:50 nivala Normal free:27276kB min:3736kB low:4668kB high:5604kB active:354084kB inactive:104152kB present:889680kB pages_scanned:260 all_unreclaimable? no
crash

Sep 3 09:46:11 lahti Normal free:26200kB min:3736kB low:4668kB high:5604kB active:242088kB inactive:94900kB present:889680kB pages_scanned:0 all_unreclaimable? no
Sep 3 09:46:11 lahti Normal free:28096kB min:3736kB low:4668kB high:5604kB active:238756kB inactive:96184kB present:889680kB pages_scanned:0 all_unreclaimable? no
ok (one additional case with "Normal free:31888kB" not received via netconsole)

Aug 30 10:40:46 ropi Normal free:14372kB min:3736kB low:4668kB high:5604kB active:393508kB inactive:93644kB present:889680kB pages_scanned:0 all_unreclaimable? no
ok

Aug 30 10:46:58 ivalo Normal free:9808kB min:3736kB low:4668kB high:5604kB active:392388kB inactive:106044kB present:889680kB pages_scanned:96 all_unreclaimable? no
Aug 30 10:46:58 ivalo Normal free:12324kB min:3736kB low:4668kB high:5604kB active:390276kB inactive:105852kB present:889680kB pages_scanned:32 all_unreclaimable? no
crash

Aug 31 16:30:02 lokka Normal free:11840kB min:3736kB low:4668kB high:5604kB active:206760kB inactive:172036kB present:889680kB pages_scanned:7 all_unreclaimable? no
Aug 31 16:30:02 lokka Normal free:13268kB min:3736kB low:4668kB high:5604kB active:205824kB inactive:171976kB present:889680kB pages_scanned:0 all_unreclaimable? no
crash

I'll try "echo 40000 >/proc/sys/vm/min_free_kbytes", but I'm not sure
whether it applies to all memory or only to low memory, and whether it
would make a difference in practice.

--
Frank

2007-11-08 08:02:36

by Nick Piggin

Subject: Re: VM/networking crash cause #1: page allocation failure (order:1, GFP_ATOMIC)

On Thursday 08 November 2007 00:48, Frank van Maarseveen wrote:
> On Wed, Nov 07, 2007 at 09:01:17AM +1100, Nick Piggin wrote:
> > On Tuesday 06 November 2007 04:42, Frank van Maarseveen wrote:
> > > For quite some time I'm seeing occasional lockups spread over 50
> > > different machines I'm maintaining. Symptom: a page allocation failure
> > > with order:1, GFP_ATOMIC, while there is plenty of memory, as it seems
> > > (lots of free pages, almost no swap used) followed by a lockup
> > > (everything dead). I've collected all (12) crash cases which occurred
> > > the last 10 weeks on 50 machines total (i.e. 1 crash every 41 weeks on
> > > average). The kernel messages are summarized to show the interesting
> > > part (IMO) they have in common. Over the years this has become the
> > > crash cause #1 for stable kernels for me (fglrx doesn't count ;).
> > >
> > > One note: I suspect that reporting a GFP_ATOMIC allocation failure in
> > > a network driver via that same driver (netconsole) may not be the
> > > smartest thing to do and this could be responsible for the lockup
> > > itself. However, the initial page allocation failure remains and I'm
> > > not sure how to address that problem.
> >
> > It isn't unexpected. If an atomic allocation doesn't have enough memory,
> > it kicks off kswapd to start freeing memory for it. However, it cannot
> > wait for memory to become free (it's GFP_ATOMIC), so it has to return
> > failure. GFP_ATOMIC allocation paths are designed so that the kernel can
> > recover from this situation, and a subsequent allocation will have free
> > memory.
> >
> > Probably in production kernels we should default to only reporting this
> > when page reclaim is not making any progress.
> >
> > > I still think the issue is memory fragmentation but if so, it looks
> > > a bit extreme to me: One system with 2GB of ram crashed after a day,
> > > merely running a couple of TCP server programs. All systems have either
> > > 1 or 2GB ram and at least 1G of (merely unused) swap.
> >
> > You can reduce the chances of it happening by increasing
> > /proc/sys/vm/min_free_kbytes.
>
> It's 3807 everywhere by default here which means roughly 950 pages if I
> understand correctly. However, the problem occurs with much more free
> pages as it seems. "grep ' free:' messages*" on the netconsole logging
> machine shows:

But it's an order-1 allocation, which may not be available due to
fragmentation. Although you might have large amounts of memory free
at a given point, fragmentation can be triggered earlier when free
memory gets very low (because order-0 allocations may have taken up
all of the free order-1 pages).
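
That per-order breakdown is what the Mem-info dumps above show (e.g.
naantali's Normal zone: 927*4kB free but almost nothing at 16kB and up),
and /proc/buddyinfo exposes the same counters on a live system. A trivial
userspace reader along these lines (assuming the usual buddyinfo layout)
makes it easy to watch whether order-1 blocks run out long before free
memory looks low:

#include <stdio.h>

/*
 * Minimal /proc/buddyinfo reader: print the number of free order-0 (4kB)
 * and order-1 (8kB) blocks per zone.  Order-1 is the size the failing
 * GFP_ATOMIC allocations in this thread are asking for.
 * Expected input format:
 *   Node 0, zone   Normal    927   6   0   1   0   1   0   1   0   0   0
 */
int main(void)
{
    char line[512];
    FILE *f = fopen("/proc/buddyinfo", "r");

    if (!f) {
        perror("/proc/buddyinfo");
        return 1;
    }
    while (fgets(line, sizeof(line), f)) {
        char zone[32];
        unsigned long nfree[11];
        int n = sscanf(line,
                       "Node %*d, zone %31s %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu",
                       zone, &nfree[0], &nfree[1], &nfree[2], &nfree[3],
                       &nfree[4], &nfree[5], &nfree[6], &nfree[7], &nfree[8],
                       &nfree[9], &nfree[10]);
        if (n >= 3)
            printf("%-8s order-0: %6lu  order-1: %6lu\n",
                   zone, nfree[0], nfree[1]);
    }
    fclose(f);
    return 0;
}

A Normal zone with a large order-0 count but order-1 near zero is exactly
the situation described above.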

Increasing it is known to help. Although you shouldn't crash due to
allocation failures... it would be nice if you could connect a serial
or vga console and see what's happening...

2007-11-08 09:08:53

by David Miller

Subject: Re: VM/networking crash cause #1: page allocation failure (order:1, GFP_ATOMIC)

From: Nick Piggin <[email protected]>
Date: Thu, 8 Nov 2007 16:55:38 +1100

> But it's an order-1 allocation, which may not be available due to
> fragmentation.

If the user is using an MTU of only 1500 (and thus not using
jumbo frames) he may be getting bitten by a bug that was recently
fixed by Herbert Xu, which made us use order > 0 pages accidentally.

His fix is below:

commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0
Author: Herbert Xu <[email protected]>
Date: Sun Oct 21 16:27:46 2007 -0700

[NET]: Fix SKB_WITH_OVERHEAD calculation

The calculation in SKB_WITH_OVERHEAD is incorrect in that it can cause
an overflow across a page boundary which is what it's meant to prevent.
In particular, the header length (X) should not be lumped together with
skb_shared_info. The latter needs to be aligned properly while the header
has no choice but to sit in front of wherever the payload is.

Therefore the correct calculation is to take away the aligned size of
skb_shared_info, and then subtract the header length. The resulting
quantity L satisfies the following inequality:

SKB_DATA_ALIGN(L + X) + sizeof(struct skb_shared_info) <= PAGE_SIZE

This is the quantity used by alloc_skb to do the actual allocation.

Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f93f22b..369f60a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -41,8 +41,7 @@
 #define SKB_DATA_ALIGN(X)      (((X) + (SMP_CACHE_BYTES - 1)) & \
                                  ~(SMP_CACHE_BYTES - 1))
 #define SKB_WITH_OVERHEAD(X)   \
-        (((X) - sizeof(struct skb_shared_info)) & \
-         ~(SMP_CACHE_BYTES - 1))
+        ((X) - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
 #define SKB_MAX_ORDER(X, ORDER) \
         SKB_WITH_OVERHEAD((PAGE_SIZE << (ORDER)) - (X))
 #define SKB_MAX_HEAD(X)        (SKB_MAX_ORDER((X), 0))
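
To see how the old macro produces order-1 allocations even with a 1500 byte
MTU, it helps to plug in some illustrative numbers. The sizes below are
assumptions for demonstration only (PAGE_SIZE 4096, SMP_CACHE_BYTES 64,
sizeof(struct skb_shared_info) taken as 168, a 16 byte header), not values
from any particular build; the computation mirrors the inequality from the
commit message:

#include <stdio.h>

/*
 * Demonstration of the SKB_WITH_OVERHEAD fix with assumed sizes; the
 * constants are illustrative, the real values depend on the architecture
 * and kernel configuration.
 */
#define PAGE_SIZE_DEMO     4096UL
#define SMP_CACHE_BYTES    64UL
#define SHARED_INFO_SIZE   168UL   /* assumed sizeof(struct skb_shared_info) */

#define SKB_DATA_ALIGN(x)  (((x) + SMP_CACHE_BYTES - 1) & ~(SMP_CACHE_BYTES - 1))

/* old (broken) and new (fixed) SKB_WITH_OVERHEAD */
#define WITH_OVERHEAD_OLD(x) (((x) - SHARED_INFO_SIZE) & ~(SMP_CACHE_BYTES - 1))
#define WITH_OVERHEAD_NEW(x) ((x) - SKB_DATA_ALIGN(SHARED_INFO_SIZE))

static void show(const char *name, unsigned long max_head, unsigned long header)
{
    /* what __alloc_skb ends up asking kmalloc for: aligned data + shinfo,
     * i.e. SKB_DATA_ALIGN(L + X) + sizeof(struct skb_shared_info) */
    unsigned long alloc = SKB_DATA_ALIGN(max_head + header) + SHARED_INFO_SIZE;

    printf("%s macro: max head %lu -> kmalloc(%lu) %s one %lu byte page\n",
           name, max_head, alloc,
           alloc <= PAGE_SIZE_DEMO ? "fits in" : "overflows", PAGE_SIZE_DEMO);
}

int main(void)
{
    unsigned long header = 16;  /* assumed headroom in front of the data */

    show("old", WITH_OVERHEAD_OLD(PAGE_SIZE_DEMO - header), header);
    show("new", WITH_OVERHEAD_NEW(PAGE_SIZE_DEMO - header), header);
    return 0;
}

With these assumed sizes the old macro yields a 4136 byte kmalloc (which the
slab allocator has to back with order-1 pages), while the fixed macro stays
at 4072 bytes, inside a single page.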