2009-04-24 10:09:56

by Zeno Davatz

Subject: Kernel 2.6.29 runs out of memory and hangs.

Dear All

Our Kernel-Version:

Linux thinpower 2.6.29 #5 SMP Sun Apr 5 20:04:08 UTC 2009 x86_64
Intel(R) Xeon(R) CPU E5450 @ 3.00GHz GenuineIntel GNU/Linux

This morning our server ran out of memory and had to be hard rebooted.
The system has 32 GB of memory and 4 quad-core Xeon CPUs.
/var/log/kernel/current tells us the following:

Apr 24 09:01:06 thinpower [1349923.491914] postgres invoked
oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
Apr 24 09:01:06 thinpower [1349923.491919] postgres cpuset=/ mems_allowed=0
Apr 24 09:01:06 thinpower [1349923.491922] Pid: 2393, comm: postgres
Not tainted 2.6.29 #5
Apr 24 09:01:06 thinpower [1349923.491924] Call Trace:
Apr 24 09:01:06 thinpower [1349923.491934] [<ffffffff8106eda9>] ?
cpuset_print_task_mems_allowed+0x99/0x9f
Apr 24 09:01:06 thinpower [1349923.491939] [<ffffffff81083d28>]
oom_kill_process+0x96/0x246
Apr 24 09:01:06 thinpower [1349923.491942] [<ffffffff8106dcf0>] ?
cpuset_mems_allowed_intersects+0x1c/0x1e
Apr 24 09:01:06 thinpower [1349923.491944] [<ffffffff810840a5>] ?
badness+0x1a3/0x1e6
Apr 24 09:01:06 thinpower [1349923.491946] [<ffffffff8108421c>]
__out_of_memory+0x134/0x14b
Apr 24 09:01:06 thinpower [1349923.491949] [<ffffffff8108438b>]
out_of_memory+0x158/0x18a
Apr 24 09:01:06 thinpower [1349923.491951] [<ffffffff81087906>]
__alloc_pages_internal+0x372/0x434
Apr 24 09:01:06 thinpower [1349923.491954] [<ffffffff810aa490>]
alloc_pages_current+0xb9/0xc2
Apr 24 09:01:06 thinpower [1349923.491957] [<ffffffff81081a59>]
__page_cache_alloc+0x67/0x6b
Apr 24 09:01:06 thinpower [1349923.491959] [<ffffffff81089426>]
__do_page_cache_readahead+0x96/0x192
Apr 24 09:01:06 thinpower [1349923.491961] [<ffffffff81089575>]
do_page_cache_readahead+0x53/0x60
Apr 24 09:01:06 thinpower [1349923.491963] [<ffffffff81081ec7>]
filemap_fault+0x15e/0x313
Apr 24 09:01:06 thinpower [1349923.491967] [<ffffffff81094451>]
__do_fault+0x53/0x393
Apr 24 09:01:06 thinpower [1349923.491969] [<ffffffff81030531>] ?
__wake_up_sync+0x45/0x4e
Apr 24 09:01:06 thinpower [1349923.491972] [<ffffffff81096702>]
handle_mm_fault+0x36b/0x854
Apr 24 09:01:06 thinpower [1349923.491976] [<ffffffff813a1acb>] ?
release_sock+0xb0/0xbb
Apr 24 09:01:06 thinpower [1349923.491979] [<ffffffff81022c8f>] ?
default_spin_lock_flags+0x9/0xe
Apr 24 09:01:06 thinpower [1349923.491984] [<ffffffff814ac6fe>]
do_page_fault+0x662/0xa5c
Apr 24 09:01:06 thinpower [1349923.491988] [<ffffffff81400b8e>] ?
inet_sendmsg+0x46/0x53
Apr 24 09:01:06 thinpower [1349923.491992] [<ffffffff8139ed97>] ?
__sock_sendmsg+0x59/0x62
Apr 24 09:01:06 thinpower [1349923.491994] [<ffffffff8139f6d8>] ?
sock_sendmsg+0xc7/0xe0
Apr 24 09:01:06 thinpower [1349923.491996] [<ffffffff81086b5e>] ?
free_pages+0x32/0x36
Apr 24 09:01:06 thinpower [1349923.492000] [<ffffffff8104e7ec>] ?
autoremove_wake_function+0x0/0x38
Apr 24 09:01:06 thinpower [1349923.492004] [<ffffffff810c1d76>] ?
core_sys_select+0x1df/0x213
Apr 24 09:01:06 thinpower [1349923.492007] [<ffffffff810b2cc2>] ?
nameidata_to_filp+0x41/0x52
Apr 24 09:01:06 thinpower [1349923.492009] [<ffffffff8139f028>] ?
sockfd_lookup_light+0x1b/0x54
Apr 24 09:01:06 thinpower [1349923.492013] [<ffffffff81011ac8>] ?
read_tsc+0xe/0x24
Apr 24 09:01:06 thinpower [1349923.492017] [<ffffffff81055990>] ?
getnstimeofday+0x58/0xb4
Apr 24 09:01:06 thinpower [1349923.492019] [<ffffffff810516ed>] ?
ktime_get_ts+0x49/0x4e
Apr 24 09:01:06 thinpower [1349923.492022] [<ffffffff810c0d35>] ?
poll_select_copy_remaining+0xc5/0xea
Apr 24 09:01:06 thinpower [1349923.492023] [<ffffffff810c1ff3>] ?
sys_select+0xa7/0xbc
Apr 24 09:01:06 thinpower [1349923.492026] [<ffffffff814aa305>]
page_fault+0x25/0x30

Before that we had this:

Apr 24 08:59:24 thinpower [1349806.982419] 6954 total pagecache pages
Apr 24 08:59:24 thinpower [1349806.982420] 0 pages in swap cache
Apr 24 08:59:24 thinpower [1349806.982422] Swap cache stats: add 0,
delete 0, find 0/0
Apr 24 08:59:24 thinpower [1349806.982423] Free swap = 0kB
Apr 24 08:59:24 thinpower [1349806.982423] Total swap = 0kB
Apr 24 08:59:24 thinpower [1349807.182725] 8388592 pages RAM
Apr 24 08:59:24 thinpower [1349807.182728] 136791 pages reserved
Apr 24 08:59:24 thinpower [1349807.182729] 35053 pages shared
Apr 24 08:59:24 thinpower [1349807.182729] 8207346 pages non-shared
Apr 24 08:59:24 thinpower [1349807.182732] Out of memory: kill process
21592 (ruby) score 109489 or a child
Apr 24 08:59:24 thinpower [1349807.182795] Killed process 21592 (ruby)
Apr 24 09:00:59 thinpower [1349916.949909] apache2 invoked oom-killer:
gfp_mask=0x1201d2, order=0, oomkilladj=0
Apr 24 09:01:00 thinpower [1349916.949915] apache2 cpuset=/ mems_allowed=0
Apr 24 09:01:00 thinpower [1349916.949918] Pid: 21599, comm: apache2
Not tainted 2.6.29 #5
Apr 24 09:01:00 thinpower [1349916.949920] Call Trace:
Apr 24 09:01:00 thinpower [1349916.949930] [<ffffffff8106eda9>] ?
cpuset_print_task_mems_allowed+0x99/0x9f
Apr 24 09:01:00 thinpower [1349916.949936] [<ffffffff81083d28>]
oom_kill_process+0x96/0x246
Apr 24 09:01:00 thinpower [1349916.949938] [<ffffffff8106dcf0>] ?
cpuset_mems_allowed_intersects+0x1c/0x1e
Apr 24 09:01:00 thinpower [1349916.949940] [<ffffffff810840a5>] ?
badness+0x1a3/0x1e6
Apr 24 09:01:00 thinpower [1349916.949942] [<ffffffff8108421c>]
__out_of_memory+0x134/0x14b
Apr 24 09:01:00 thinpower [1349916.949945] [<ffffffff8108438b>]
out_of_memory+0x158/0x18a
Apr 24 09:01:00 thinpower [1349916.949947] [<ffffffff81087906>]
__alloc_pages_internal+0x372/0x434
Apr 24 09:01:00 thinpower [1349916.949950] [<ffffffff810aa490>]
alloc_pages_current+0xb9/0xc2
Apr 24 09:01:00 thinpower [1349916.949953] [<ffffffff81081a59>]
__page_cache_alloc+0x67/0x6b
Apr 24 09:01:00 thinpower [1349916.949954] [<ffffffff81081841>] ?
sync_page+0x0/0x40
Apr 24 09:01:00 thinpower [1349916.949957] [<ffffffff81089426>]
__do_page_cache_readahead+0x96/0x192
Apr 24 09:01:00 thinpower [1349916.949959] [<ffffffff81089575>]
do_page_cache_readahead+0x53/0x60
Apr 24 09:01:00 thinpower [1349916.949961] [<ffffffff81081ec7>]
filemap_fault+0x15e/0x313
Apr 24 09:01:00 thinpower [1349916.949964] [<ffffffff81094451>]
__do_fault+0x53/0x393
Apr 24 09:01:00 thinpower [1349916.949967] [<ffffffff81096702>]
handle_mm_fault+0x36b/0x854
Apr 24 09:01:00 thinpower [1349916.949971] [<ffffffff8100cf6e>] ?
apic_timer_interrupt+0xe/0x20
Apr 24 09:01:00 thinpower [1349916.949974] [<ffffffff81022c8f>] ?
default_spin_lock_flags+0x9/0xe
Apr 24 09:01:00 thinpower [1349916.949976] [<ffffffff81022c8f>] ?
default_spin_lock_flags+0x9/0xe
Apr 24 09:01:00 thinpower [1349916.949981] [<ffffffff814ac6fe>]
do_page_fault+0x662/0xa5c
Apr 24 09:01:00 thinpower [1349916.949985] [<ffffffff813a1acb>] ?
release_sock+0xb0/0xbb
Apr 24 09:01:00 thinpower [1349916.949989] [<ffffffff813e6628>] ?
tcp_recvmsg+0x720/0x833
Apr 24 09:01:00 thinpower [1349916.949991] [<ffffffff813a0eb6>] ?
sock_common_recvmsg+0x32/0x47
Apr 24 09:01:00 thinpower [1349916.949995] [<ffffffff81198a02>] ?
selinux_socket_recvmsg+0x1d/0x1f
Apr 24 09:01:00 thinpower [1349916.949999] [<ffffffff8139ef03>] ?
__sock_recvmsg+0x6d/0x79
Apr 24 09:01:00 thinpower [1349916.950001] [<ffffffff8139effd>] ?
sock_aio_read+0xee/0xfe
Apr 24 09:01:00 thinpower [1349916.950005] [<ffffffff810b42b3>] ?
do_sync_read+0xe7/0x12d
Apr 24 09:01:00 thinpower [1349916.950008] [<ffffffff813f22bd>] ?
tcp_write_xmit+0x453/0x913

Thank you for your feedback.

Best regards
Zeno Davatz


2009-04-24 17:17:47

by David Rientjes

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

On Fri, 24 Apr 2009, Zeno Davatz wrote:

> Dear All
>
> Our Kernel-Version:
>
> Linux thinpower 2.6.29 #5 SMP Sun Apr 5 20:04:08 UTC 2009 x86_64
> Intel(R) Xeon(R) CPU E5450 @ 3.00GHz GenuineIntel GNU/Linux
>
> This morning our server ran out of memory and had to be hard rebooted.
> The system has 32 GB of memory and 4 quad-core Xeon CPUs.
> /var/log/kernel/current tells us the following:
>
> Apr 24 09:01:06 thinpower [1349923.491914] postgres invoked
> oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
> Apr 24 09:01:06 thinpower [1349923.491919] postgres cpuset=/ mems_allowed=0
> Apr 24 09:01:06 thinpower [1349923.491922] Pid: 2393, comm: postgres
> Not tainted 2.6.29 #5
> Apr 24 09:01:06 thinpower [1349923.491924] Call Trace:
> Apr 24 09:01:06 thinpower [1349923.491934] [<ffffffff8106eda9>] ?
> cpuset_print_task_mems_allowed+0x99/0x9f
> Apr 24 09:01:06 thinpower [1349923.491939] [<ffffffff81083d28>]
> oom_kill_process+0x96/0x246
> Apr 24 09:01:06 thinpower [1349923.491942] [<ffffffff8106dcf0>] ?
> cpuset_mems_allowed_intersects+0x1c/0x1e
> Apr 24 09:01:06 thinpower [1349923.491944] [<ffffffff810840a5>] ?
> badness+0x1a3/0x1e6
> Apr 24 09:01:06 thinpower [1349923.491946] [<ffffffff8108421c>]
> __out_of_memory+0x134/0x14b
> Apr 24 09:01:06 thinpower [1349923.491949] [<ffffffff8108438b>]
> out_of_memory+0x158/0x18a
> Apr 24 09:01:06 thinpower [1349923.491951] [<ffffffff81087906>]
> __alloc_pages_internal+0x372/0x434
> Apr 24 09:01:06 thinpower [1349923.491954] [<ffffffff810aa490>]
> alloc_pages_current+0xb9/0xc2
> Apr 24 09:01:06 thinpower [1349923.491957] [<ffffffff81081a59>]
> __page_cache_alloc+0x67/0x6b
> Apr 24 09:01:06 thinpower [1349923.491959] [<ffffffff81089426>]
> __do_page_cache_readahead+0x96/0x192
> Apr 24 09:01:06 thinpower [1349923.491961] [<ffffffff81089575>]
> do_page_cache_readahead+0x53/0x60
> Apr 24 09:01:06 thinpower [1349923.491963] [<ffffffff81081ec7>]
> filemap_fault+0x15e/0x313
> Apr 24 09:01:06 thinpower [1349923.491967] [<ffffffff81094451>]
> __do_fault+0x53/0x393
> Apr 24 09:01:06 thinpower [1349923.491969] [<ffffffff81030531>] ?
> __wake_up_sync+0x45/0x4e
> Apr 24 09:01:06 thinpower [1349923.491972] [<ffffffff81096702>]
> handle_mm_fault+0x36b/0x854
> Apr 24 09:01:06 thinpower [1349923.491976] [<ffffffff813a1acb>] ?
> release_sock+0xb0/0xbb
> Apr 24 09:01:06 thinpower [1349923.491979] [<ffffffff81022c8f>] ?
> default_spin_lock_flags+0x9/0xe
> Apr 24 09:01:06 thinpower [1349923.491984] [<ffffffff814ac6fe>]
> do_page_fault+0x662/0xa5c
> Apr 24 09:01:06 thinpower [1349923.491988] [<ffffffff81400b8e>] ?
> inet_sendmsg+0x46/0x53
> Apr 24 09:01:06 thinpower [1349923.491992] [<ffffffff8139ed97>] ?
> __sock_sendmsg+0x59/0x62
> Apr 24 09:01:06 thinpower [1349923.491994] [<ffffffff8139f6d8>] ?
> sock_sendmsg+0xc7/0xe0
> Apr 24 09:01:06 thinpower [1349923.491996] [<ffffffff81086b5e>] ?
> free_pages+0x32/0x36
> Apr 24 09:01:06 thinpower [1349923.492000] [<ffffffff8104e7ec>] ?
> autoremove_wake_function+0x0/0x38
> Apr 24 09:01:06 thinpower [1349923.492004] [<ffffffff810c1d76>] ?
> core_sys_select+0x1df/0x213
> Apr 24 09:01:06 thinpower [1349923.492007] [<ffffffff810b2cc2>] ?
> nameidata_to_filp+0x41/0x52
> Apr 24 09:01:06 thinpower [1349923.492009] [<ffffffff8139f028>] ?
> sockfd_lookup_light+0x1b/0x54
> Apr 24 09:01:06 thinpower [1349923.492013] [<ffffffff81011ac8>] ?
> read_tsc+0xe/0x24
> Apr 24 09:01:06 thinpower [1349923.492017] [<ffffffff81055990>] ?
> getnstimeofday+0x58/0xb4
> Apr 24 09:01:06 thinpower [1349923.492019] [<ffffffff810516ed>] ?
> ktime_get_ts+0x49/0x4e
> Apr 24 09:01:06 thinpower [1349923.492022] [<ffffffff810c0d35>] ?
> poll_select_copy_remaining+0xc5/0xea
> Apr 24 09:01:06 thinpower [1349923.492023] [<ffffffff810c1ff3>] ?
> sys_select+0xa7/0xbc
> Apr 24 09:01:06 thinpower [1349923.492026] [<ffffffff814aa305>]
> page_fault+0x25/0x30
>

Are these the last messages in the dmesg? There should be subsequent
information that describes the current state of memory following the stack
trace.

If this is reproducible, I'd recommend enabling
/proc/sys/vm/oom_dump_tasks so that the oom killer will dump the tasklist
and show us what may be causing the livelock.

> Before that we had this:
>
> Apr 24 08:59:24 thinpower [1349806.982419] 6954 total pagecache pages
> Apr 24 08:59:24 thinpower [1349806.982420] 0 pages in swap cache
> Apr 24 08:59:24 thinpower [1349806.982422] Swap cache stats: add 0,
> delete 0, find 0/0
> Apr 24 08:59:24 thinpower [1349806.982423] Free swap = 0kB
> Apr 24 08:59:24 thinpower [1349806.982423] Total swap = 0kB
> Apr 24 08:59:24 thinpower [1349807.182725] 8388592 pages RAM
> Apr 24 08:59:24 thinpower [1349807.182728] 136791 pages reserved
> Apr 24 08:59:24 thinpower [1349807.182729] 35053 pages shared
> Apr 24 08:59:24 thinpower [1349807.182729] 8207346 pages non-shared
> Apr 24 08:59:24 thinpower [1349807.182732] Out of memory: kill process
> 21592 (ruby) score 109489 or a child
> Apr 24 08:59:24 thinpower [1349807.182795] Killed process 21592 (ruby)
> Apr 24 09:00:59 thinpower [1349916.949909] apache2 invoked oom-killer:
> gfp_mask=0x1201d2, order=0, oomkilladj=0

apache2 in this case (and postgres later) are valid oom killer targets
because their /proc/pid/oom_adj score is 0. The only reason they would not
be killed subsequently is that the already killed task, ruby, is
PF_EXITING and has failed to exit. It has access to memory reserves,
and if it hangs before dying, the machine will livelock.
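
If it helps to confirm that on your side, you can look at the adjustment
and at the badness score the kernel computes for any still-running task
(a rough sketch; substitute a real pid for <pid>):

    cat /proc/<pid>/oom_adj
    cat /proc/<pid>/oom_score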

> Apr 24 09:01:00 thinpower [1349916.949915] apache2 cpuset=/ mems_allowed=0
> Apr 24 09:01:00 thinpower [1349916.949918] Pid: 21599, comm: apache2
> Not tainted 2.6.29 #5
> Apr 24 09:01:00 thinpower [1349916.949920] Call Trace:
> Apr 24 09:01:00 thinpower [1349916.949930] [<ffffffff8106eda9>] ?
> cpuset_print_task_mems_allowed+0x99/0x9f
> Apr 24 09:01:00 thinpower [1349916.949936] [<ffffffff81083d28>]
> oom_kill_process+0x96/0x246
> Apr 24 09:01:00 thinpower [1349916.949938] [<ffffffff8106dcf0>] ?
> cpuset_mems_allowed_intersects+0x1c/0x1e
> Apr 24 09:01:00 thinpower [1349916.949940] [<ffffffff810840a5>] ?
> badness+0x1a3/0x1e6
> Apr 24 09:01:00 thinpower [1349916.949942] [<ffffffff8108421c>]
> __out_of_memory+0x134/0x14b
> Apr 24 09:01:00 thinpower [1349916.949945] [<ffffffff8108438b>]
> out_of_memory+0x158/0x18a
> Apr 24 09:01:00 thinpower [1349916.949947] [<ffffffff81087906>]
> __alloc_pages_internal+0x372/0x434
> Apr 24 09:01:00 thinpower [1349916.949950] [<ffffffff810aa490>]
> alloc_pages_current+0xb9/0xc2
> Apr 24 09:01:00 thinpower [1349916.949953] [<ffffffff81081a59>]
> __page_cache_alloc+0x67/0x6b
> Apr 24 09:01:00 thinpower [1349916.949954] [<ffffffff81081841>] ?
> sync_page+0x0/0x40
> Apr 24 09:01:00 thinpower [1349916.949957] [<ffffffff81089426>]
> __do_page_cache_readahead+0x96/0x192
> Apr 24 09:01:00 thinpower [1349916.949959] [<ffffffff81089575>]
> do_page_cache_readahead+0x53/0x60
> Apr 24 09:01:00 thinpower [1349916.949961] [<ffffffff81081ec7>]
> filemap_fault+0x15e/0x313
> Apr 24 09:01:00 thinpower [1349916.949964] [<ffffffff81094451>]
> __do_fault+0x53/0x393
> Apr 24 09:01:00 thinpower [1349916.949967] [<ffffffff81096702>]
> handle_mm_fault+0x36b/0x854
> Apr 24 09:01:00 thinpower [1349916.949971] [<ffffffff8100cf6e>] ?
> apic_timer_interrupt+0xe/0x20
> Apr 24 09:01:00 thinpower [1349916.949974] [<ffffffff81022c8f>] ?
> default_spin_lock_flags+0x9/0xe
> Apr 24 09:01:00 thinpower [1349916.949976] [<ffffffff81022c8f>] ?
> default_spin_lock_flags+0x9/0xe
> Apr 24 09:01:00 thinpower [1349916.949981] [<ffffffff814ac6fe>]
> do_page_fault+0x662/0xa5c
> Apr 24 09:01:00 thinpower [1349916.949985] [<ffffffff813a1acb>] ?
> release_sock+0xb0/0xbb
> Apr 24 09:01:00 thinpower [1349916.949989] [<ffffffff813e6628>] ?
> tcp_recvmsg+0x720/0x833
> Apr 24 09:01:00 thinpower [1349916.949991] [<ffffffff813a0eb6>] ?
> sock_common_recvmsg+0x32/0x47
> Apr 24 09:01:00 thinpower [1349916.949995] [<ffffffff81198a02>] ?
> selinux_socket_recvmsg+0x1d/0x1f
> Apr 24 09:01:00 thinpower [1349916.949999] [<ffffffff8139ef03>] ?
> __sock_recvmsg+0x6d/0x79
> Apr 24 09:01:00 thinpower [1349916.950001] [<ffffffff8139effd>] ?
> sock_aio_read+0xee/0xfe
> Apr 24 09:01:00 thinpower [1349916.950005] [<ffffffff810b42b3>] ?
> do_sync_read+0xe7/0x12d
> Apr 24 09:01:00 thinpower [1349916.950008] [<ffffffff813f22bd>] ?
> tcp_write_xmit+0x453/0x913
>

2009-04-24 18:13:22

by Zeno Davatz

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

Dear David

Thank you for your reply; my answers are inline below:

On Fri, Apr 24, 2009 at 7:17 PM, David Rientjes <[email protected]> wrote:
> On Fri, 24 Apr 2009, Zeno Davatz wrote:
>> Linux thinpower 2.6.29 #5 SMP Sun Apr 5 20:04:08 UTC 2009 x86_64
>> Intel(R) Xeon(R) CPU E5450 @ 3.00GHz GenuineIntel GNU/Linux
>>
>> This morning our server ran out of memory and had to be hard rebooted.
>> The system has 32 GB of memory and 4 quad-core Xeon CPUs.
>> /var/log/kernel/current tells us the following:
>>
>> Apr 24 09:01:06 thinpower [1349923.491914] postgres invoked
>> oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
>> Apr 24 09:01:06 thinpower [1349923.491919] postgres cpuset=/ mems_allowed=0
>> Apr 24 09:01:06 thinpower [1349923.491922] Pid: 2393, comm: postgres
>> Not tainted 2.6.29 #5
>> Apr 24 09:01:06 thinpower [1349923.491924] Call Trace:
>> Apr 24 09:01:06 thinpower [1349923.491934] ?[<ffffffff8106eda9>] ?
>> cpuset_print_task_mems_allowed+0x99/0x9f
>> Apr 24 09:01:06 thinpower [1349923.491939] ?[<ffffffff81083d28>]
>> oom_kill_process+0x96/0x246
>> Apr 24 09:01:06 thinpower [1349923.491942] ?[<ffffffff8106dcf0>] ?
>> cpuset_mems_allowed_intersects+0x1c/0x1e
>> Apr 24 09:01:06 thinpower [1349923.491944] ?[<ffffffff810840a5>] ?
>> badness+0x1a3/0x1e6
>> Apr 24 09:01:06 thinpower [1349923.491946] ?[<ffffffff8108421c>]
>> __out_of_memory+0x134/0x14b
>> Apr 24 09:01:06 thinpower [1349923.491949] ?[<ffffffff8108438b>]
>> out_of_memory+0x158/0x18a
>> Apr 24 09:01:06 thinpower [1349923.491951] ?[<ffffffff81087906>]
>> __alloc_pages_internal+0x372/0x434
>> Apr 24 09:01:06 thinpower [1349923.491954] ?[<ffffffff810aa490>]
>> alloc_pages_current+0xb9/0xc2
>> Apr 24 09:01:06 thinpower [1349923.491957] ?[<ffffffff81081a59>]
>> __page_cache_alloc+0x67/0x6b
>> Apr 24 09:01:06 thinpower [1349923.491959] ?[<ffffffff81089426>]
>> __do_page_cache_readahead+0x96/0x192
>> Apr 24 09:01:06 thinpower [1349923.491961] ?[<ffffffff81089575>]
>> do_page_cache_readahead+0x53/0x60
>> Apr 24 09:01:06 thinpower [1349923.491963] ?[<ffffffff81081ec7>]
>> filemap_fault+0x15e/0x313
>> Apr 24 09:01:06 thinpower [1349923.491967] ?[<ffffffff81094451>]
>> __do_fault+0x53/0x393
>> Apr 24 09:01:06 thinpower [1349923.491969] ?[<ffffffff81030531>] ?
>> __wake_up_sync+0x45/0x4e
>> Apr 24 09:01:06 thinpower [1349923.491972] ?[<ffffffff81096702>]
>> handle_mm_fault+0x36b/0x854
>> Apr 24 09:01:06 thinpower [1349923.491976] ?[<ffffffff813a1acb>] ?
>> release_sock+0xb0/0xbb
>> Apr 24 09:01:06 thinpower [1349923.491979] ?[<ffffffff81022c8f>] ?
>> default_spin_lock_flags+0x9/0xe
>> Apr 24 09:01:06 thinpower [1349923.491984] ?[<ffffffff814ac6fe>]
>> do_page_fault+0x662/0xa5c
>> Apr 24 09:01:06 thinpower [1349923.491988] ?[<ffffffff81400b8e>] ?
>> inet_sendmsg+0x46/0x53
>> Apr 24 09:01:06 thinpower [1349923.491992] ?[<ffffffff8139ed97>] ?
>> __sock_sendmsg+0x59/0x62
>> Apr 24 09:01:06 thinpower [1349923.491994] ?[<ffffffff8139f6d8>] ?
>> sock_sendmsg+0xc7/0xe0
>> Apr 24 09:01:06 thinpower [1349923.491996] ?[<ffffffff81086b5e>] ?
>> free_pages+0x32/0x36
>> Apr 24 09:01:06 thinpower [1349923.492000] ?[<ffffffff8104e7ec>] ?
>> autoremove_wake_function+0x0/0x38
>> Apr 24 09:01:06 thinpower [1349923.492004] ?[<ffffffff810c1d76>] ?
>> core_sys_select+0x1df/0x213
>> Apr 24 09:01:06 thinpower [1349923.492007] ?[<ffffffff810b2cc2>] ?
>> nameidata_to_filp+0x41/0x52
>> Apr 24 09:01:06 thinpower [1349923.492009] ?[<ffffffff8139f028>] ?
>> sockfd_lookup_light+0x1b/0x54
>> Apr 24 09:01:06 thinpower [1349923.492013] ?[<ffffffff81011ac8>] ?
>> read_tsc+0xe/0x24
>> Apr 24 09:01:06 thinpower [1349923.492017] ?[<ffffffff81055990>] ?
>> getnstimeofday+0x58/0xb4
>> Apr 24 09:01:06 thinpower [1349923.492019] ?[<ffffffff810516ed>] ?
>> ktime_get_ts+0x49/0x4e
>> Apr 24 09:01:06 thinpower [1349923.492022] ?[<ffffffff810c0d35>] ?
>> poll_select_copy_remaining+0xc5/0xea
>> Apr 24 09:01:06 thinpower [1349923.492023] ?[<ffffffff810c1ff3>] ?
>> sys_select+0xa7/0xbc
>> Apr 24 09:01:06 thinpower [1349923.492026] ?[<ffffffff814aa305>]
>> page_fault+0x25/0x30
>>
>
> Are these the last messages in the dmesg? There should be subsequent
> information that describes the current state of memory following the stack
> trace.

These are the last messages from /var/log/kernel/current (let me know
if you need more):

Apr 24 09:01:00 thinpower [1349916.950049] CPU 0: hi: 186, btch:
31 usd: 110
Apr 24 09:01:00 thinpower [1349916.950051] CPU 1: hi: 186, btch:
31 usd: 111
Apr 24 09:01:00 thinpower [1349916.950052] CPU 2: hi: 186, btch:
31 usd: 162
Apr 24 09:01:00 thinpower [1349916.950053] CPU 3: hi: 186, btch:
31 usd: 157
Apr 24 09:01:00 thinpower [1349916.950055] CPU 4: hi: 186, btch:
31 usd: 109
Apr 24 09:01:00 thinpower [1349916.950056] CPU 5: hi: 186, btch:
31 usd: 102
Apr 24 09:01:00 thinpower [1349916.950057] CPU 6: hi: 186, btch:
31 usd: 179
Apr 24 09:01:00 thinpower [1349916.950058] CPU 7: hi: 186, btch:
31 usd: 123
Apr 24 09:01:00 thinpower [1349916.950061] Active_anon:7600155
active_file:0 inactive_anon:552800
Apr 24 09:01:00 thinpower [1349916.950061] inactive_file:5
unevictable:0 dirty:0 writeback:0 unstable:0
Apr 24 09:01:00 thinpower [1349916.950062] free:38559 slab:12539
mapped:2 pagetables:22733 bounce:0
Apr 24 09:01:00 thinpower [1349916.950064] Node 0 DMA free:15700kB
min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB
active_file:0kB inactive_file:0kB unevictable:0kB present:15156kB
pages_scanned:0 all_unreclaimable? yes
Apr 24 09:01:00 thinpower [1349916.950067] lowmem_reserve[]: 0 3255 32292 32292
Apr 24 09:01:00 thinpower [1349916.950070] Node 0 DMA32 free:117732kB
min:2316kB low:2892kB high:3472kB active_anon:2266592kB
inactive_anon:453072kB active_file:0kB inactive_file:4kB
unevictable:0kB present:3333280kB pages_scanned:1587467
all_unreclaimable? no
Apr 24 09:01:00 thinpower [1349916.950074] lowmem_reserve[]: 0 0 29037 29037
Apr 24 09:01:00 thinpower [1349916.950076] Node 0 Normal free:20804kB
min:20672kB low:25840kB high:31008kB active_anon:28134028kB
inactive_anon:1758128kB active_file:0kB inactive_file:16kB
unevictable:0kB present:29734400kB pages_scanned:896
all_unreclaimable? no
Apr 24 09:01:00 thinpower [1349916.950080] lowmem_reserve[]: 0 0 0 0
Apr 24 09:01:00 thinpower [1349916.950082] Node 0 DMA: 1*4kB 0*8kB
1*16kB 2*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB
3*4096kB = 15700kB
Apr 24 09:01:00 thinpower [1349916.950089] Node 0 DMA32: 159*4kB
127*8kB 59*16kB 14*32kB 12*64kB 6*128kB 20*256kB 15*512kB 6*1024kB
2*2048kB 22*4096kB = 117732kB
Apr 24 09:01:00 thinpower [1349916.950095] Node 0 Normal: 93*4kB 8*8kB
54*16kB 11*32kB 2*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB
4*4096kB = 21108kB
Apr 24 09:01:00 thinpower [1349916.950101] 6946 total pagecache pages
Apr 24 09:01:00 thinpower [1349916.950102] 0 pages in swap cache
Apr 24 09:01:00 thinpower [1349916.950104] Swap cache stats: add 0,
delete 0, find 0/0
Apr 24 09:01:00 thinpower [1349916.950105] Free swap = 0kB
Apr 24 09:01:00 thinpower [1349916.950106] Total swap = 0kB
Apr 24 09:01:00 thinpower [1349917.149549] 8388592 pages RAM
Apr 24 09:01:00 thinpower [1349917.149551] 136791 pages reserved
Apr 24 09:01:00 thinpower [1349917.149552] 66429 pages shared
Apr 24 09:01:00 thinpower [1349917.149553] 8206765 pages non-shared
Apr 24 09:01:00 thinpower [1349917.149556] Out of memory: kill process
21692 (ruby) score 58368 or a child
Apr 24 09:01:00 thinpower [1349917.149621] Killed process 21692 (ruby)
Apr 24 09:01:06 thinpower [1349923.491914] postgres invoked
oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
Apr 24 09:01:06 thinpower [1349923.491919] postgres cpuset=/ mems_allowed=0
Apr 24 09:01:06 thinpower [1349923.491922] Pid: 2393, comm: postgres
Not tainted 2.6.29 #5
Apr 24 09:01:06 thinpower [1349923.491924] Call Trace:
Apr 24 09:01:06 thinpower [1349923.491934] [<ffffffff8106eda9>] ?
cpuset_print_task_mems_allowed+0x99/0x9f
Apr 24 09:01:06 thinpower [1349923.491939] [<ffffffff81083d28>]
oom_kill_process+0x96/0x246
Apr 24 09:01:06 thinpower [1349923.491942] [<ffffffff8106dcf0>] ?
cpuset_mems_allowed_intersects+0x1c/0x1e
Apr 24 09:01:06 thinpower [1349923.491944] [<ffffffff810840a5>] ?
badness+0x1a3/0x1e6
Apr 24 09:01:06 thinpower [1349923.491946] [<ffffffff8108421c>]
__out_of_memory+0x134/0x14b
Apr 24 09:01:06 thinpower [1349923.491949] [<ffffffff8108438b>]
out_of_memory+0x158/0x18a
Apr 24 09:01:06 thinpower [1349923.491951] [<ffffffff81087906>]
__alloc_pages_internal+0x372/0x434
Apr 24 09:01:06 thinpower [1349923.491954] [<ffffffff810aa490>]
alloc_pages_current+0xb9/0xc2
Apr 24 09:01:06 thinpower [1349923.491957] [<ffffffff81081a59>]
__page_cache_alloc+0x67/0x6b
Apr 24 09:01:06 thinpower [1349923.491959] [<ffffffff81089426>]
__do_page_cache_readahead+0x96/0x192
Apr 24 09:01:06 thinpower [1349923.491961] [<ffffffff81089575>]
do_page_cache_readahead+0x53/0x60
Apr 24 09:01:06 thinpower [1349923.491963] [<ffffffff81081ec7>]
filemap_fault+0x15e/0x313
Apr 24 09:01:06 thinpower [1349923.491967] [<ffffffff81094451>]
__do_fault+0x53/0x393
Apr 24 09:01:06 thinpower [1349923.491969] [<ffffffff81030531>] ?
__wake_up_sync+0x45/0x4e
Apr 24 09:01:06 thinpower [1349923.491972] [<ffffffff81096702>]
handle_mm_fault+0x36b/0x854
Apr 24 09:01:06 thinpower [1349923.491976] [<ffffffff813a1acb>] ?
release_sock+0xb0/0xbb
Apr 24 09:01:06 thinpower [1349923.491979] [<ffffffff81022c8f>] ?
default_spin_lock_flags+0x9/0xe
Apr 24 09:01:06 thinpower [1349923.491984] [<ffffffff814ac6fe>]
do_page_fault+0x662/0xa5c
Apr 24 09:01:06 thinpower [1349923.491988] [<ffffffff81400b8e>] ?
inet_sendmsg+0x46/0x53
Apr 24 09:01:06 thinpower [1349923.491992] [<ffffffff8139ed97>] ?
__sock_sendmsg+0x59/0x62
Apr 24 09:01:06 thinpower [1349923.491994] [<ffffffff8139f6d8>] ?
sock_sendmsg+0xc7/0xe0
Apr 24 09:01:06 thinpower [1349923.491996] [<ffffffff81086b5e>] ?
free_pages+0x32/0x36
Apr 24 09:01:06 thinpower [1349923.492000] [<ffffffff8104e7ec>] ?
autoremove_wake_function+0x0/0x38
Apr 24 09:01:06 thinpower [1349923.492004] [<ffffffff810c1d76>] ?
core_sys_select+0x1df/0x213
Apr 24 09:01:06 thinpower [1349923.492007] [<ffffffff810b2cc2>] ?
nameidata_to_filp+0x41/0x52
Apr 24 09:01:06 thinpower [1349923.492009] [<ffffffff8139f028>] ?
sockfd_lookup_light+0x1b/0x54
Apr 24 09:01:06 thinpower [1349923.492013] [<ffffffff81011ac8>] ?
read_tsc+0xe/0x24
Apr 24 09:01:06 thinpower [1349923.492017] [<ffffffff81055990>] ?
getnstimeofday+0x58/0xb4
Apr 24 09:01:06 thinpower [1349923.492019] [<ffffffff810516ed>] ?
ktime_get_ts+0x49/0x4e
Apr 24 09:01:06 thinpower [1349923.492022] [<ffffffff810c0d35>] ?
poll_select_copy_remaining+0xc5/0xea
Apr 24 09:01:06 thinpower [1349923.492023] [<ffffffff810c1ff3>] ?
sys_select+0xa7/0xbc
Apr 24 09:01:06 thinpower [1349923.492026] [<ffffffff814aa305>]
page_fault+0x25/0x30
Apr 24 09:01:06 thinpower [1349923.492028] Mem-Info:
Apr 24 09:01:06 thinpower [1349923.492029] Node 0 DMA per-cpu:
Apr 24 09:01:06 thinpower [1349923.492031] CPU 0: hi: 0, btch:
1 usd: 0
Apr 24 09:01:06 thinpower [1349923.492032] CPU 1: hi: 0, btch:
1 usd: 0
Apr 24 09:01:06 thinpower [1349923.492034] CPU 2: hi: 0, btch:
1 usd: 0
Apr 24 09:01:06 thinpower [1349923.492035] CPU 3: hi: 0, btch:
1 usd: 0
Apr 24 09:01:06 thinpower [1349923.492036] CPU 4: hi: 0, btch:
1 usd: 0
Apr 24 09:01:06 thinpower [1349923.492037] CPU 5: hi: 0, btch:
1 usd: 0
Apr 24 09:01:06 thinpower [1349923.492039] CPU 6: hi: 0, btch:
1 usd: 0
Apr 24 09:01:06 thinpower [1349923.492040] CPU 7: hi: 0, btch:
1 usd: 0
Apr 24 09:01:06 thinpower [1349923.492041] Node 0 DMA32 per-cpu:
Apr 24 09:01:06 thinpower [1349923.492043] CPU 0: hi: 186, btch:
31 usd: 29
Apr 24 09:01:06 thinpower [1349923.492044] CPU 1: hi: 186, btch:
31 usd: 17
Apr 24 09:01:06 thinpower [1349923.492046] CPU 2: hi: 186, btch:
31 usd: 176
Apr 24 09:01:06 thinpower [1349923.492047] CPU 3: hi: 186, btch:
31 usd: 19
Apr 24 09:01:06 thinpower [1349923.492048] CPU 4: hi: 186, btch:
31 usd: 29
Apr 24 09:01:06 thinpower [1349923.492050] CPU 5: hi: 186, btch:
31 usd: 0
Apr 24 09:01:06 thinpower [1349923.492051] CPU 6: hi: 186, btch:
31 usd: 0
Apr 24 09:01:06 thinpower [1349923.492052] CPU 7: hi: 186, btch:
31 usd: 31
Apr 24 09:01:06 thinpower [1349923.492053] Node 0 Normal per-cpu:
Apr 24 09:01:06 thinpower [1349923.492055] CPU 0: hi: 186, btch:
31 usd: 50
Apr 24 09:01:06 thinpower [1349923.492056] CPU 1: hi: 186, btch:
31 usd: 20
Apr 24 09:01:06 thinpower [1349923.492057] CPU 2: hi: 186, btch:
31 usd: 42
Apr 24 09:01:06 thinpower [1349923.492059] CPU 3: hi: 186, btch:
31 usd: 124
Apr 24 09:01:06 thinpower [1349923.492060] CPU 4: hi: 186, btch:
31 usd: 104
Apr 24 09:01:06 thinpower [1349923.492061] CPU 5: hi: 186, btch:
31 usd: 95
Apr 24 09:01:06 thinpower [1349923.492062] CPU 6: hi: 186, btch:
31 usd: 178
Apr 24 09:01:06 thinpower [1349923.492064] CPU 7: hi: 186, btch:
31 usd: 60
Apr 24 09:01:06 thinpower [1349923.492066] Active_anon:7595417
active_file:641 inactive_anon:551929
Apr 24 09:01:06 thinpower [1349923.492067] inactive_file:5122
unevictable:0 dirty:0 writeback:0 unstable:0
Apr 24 09:01:06 thinpower [1349923.492068] free:39401 slab:12443
mapped:1801 pagetables:22566 bounce:0
Apr 24 09:01:06 thinpower [1349923.492069] Node 0 DMA free:15700kB
min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB
active_file:0kB inactive_file:0kB unevictable:0kB present:15156kB
pages_scanned:0 all_unreclaimable? yes
Apr 24 09:01:06 thinpower [1349923.492073] lowmem_reserve[]: 0 3255 32292 32292
Apr 24 09:01:06 thinpower [1349923.492076] Node 0 DMA32 free:118848kB
min:2316kB low:2892kB high:3472kB active_anon:2265276kB
inactive_anon:452944kB active_file:4kB inactive_file:0kB
unevictable:0kB present:3333280kB pages_scanned:2692832
all_unreclaimable? no
Apr 24 09:01:06 thinpower [1349923.492080] lowmem_reserve[]: 0 0 29037 29037
Apr 24 09:01:06 thinpower [1349923.492082] Node 0 Normal free:23056kB
min:20672kB low:25840kB high:31008kB active_anon:28116392kB
inactive_anon:1754772kB active_file:2560kB inactive_file:20488kB
unevictable:0kB present:29734400kB pages_scanned:77696
all_unreclaimable? no
Apr 24 09:01:06 thinpower [1349923.492086] lowmem_reserve[]: 0 0 0 0
Apr 24 09:01:06 thinpower [1349923.492088] Node 0 DMA: 1*4kB 0*8kB
1*16kB 2*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB
3*4096kB = 15700kB
Apr 24 09:01:06 thinpower [1349923.492095] Node 0 DMA32: 172*4kB
134*8kB 66*16kB 18*32kB 18*64kB 7*128kB 21*256kB 15*512kB 6*1024kB
2*2048kB 22*4096kB = 118848kB
Apr 24 09:01:06 thinpower [1349923.492101] Node 0 Normal: 251*4kB
24*8kB 144*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 0*1024kB
1*2048kB 4*4096kB = 22924kB
Apr 24 09:01:06 thinpower [1349923.492107] 12705 total pagecache pages
Apr 24 09:01:06 thinpower [1349923.492108] 0 pages in swap cache
Apr 24 09:01:06 thinpower [1349923.492109] Swap cache stats: add 0,
delete 0, find 0/0
Apr 24 09:01:06 thinpower [1349923.492110] Free swap = 0kB
Apr 24 09:01:06 thinpower [1349923.492111] Total swap = 0kB
Apr 24 09:01:06 thinpower [1349923.693323] 8388592 pages RAM
Apr 24 09:01:06 thinpower [1349923.693326] 136791 pages reserved
Apr 24 09:01:06 thinpower [1349923.693327] 69633 pages shared
Apr 24 09:01:06 thinpower [1349923.693328] 8205493 pages non-shared
Apr 24 09:01:06 thinpower [1349923.693331] Out of memory: kill process
21490 (apache2) score 53801 or a child
Apr 24 09:01:06 thinpower [1349923.693410] Killed process 21490 (apache2)

> If this is reproducible, I'd recommend enabling
> /proc/sys/vm/oom_dump_tasks so that the oom killer will dump the tasklist
> and show us what may be causing the livelock.

Ok, how do I enable that? I will google for it.

If we can reproduce it, I will let you know.

>> Before that we had this:
>>
>> Apr 24 08:59:24 thinpower [1349806.982419] 6954 total pagecache pages
>> Apr 24 08:59:24 thinpower [1349806.982420] 0 pages in swap cache
>> Apr 24 08:59:24 thinpower [1349806.982422] Swap cache stats: add 0,
>> delete 0, find 0/0
>> Apr 24 08:59:24 thinpower [1349806.982423] Free swap = 0kB
>> Apr 24 08:59:24 thinpower [1349806.982423] Total swap = 0kB
>> Apr 24 08:59:24 thinpower [1349807.182725] 8388592 pages RAM
>> Apr 24 08:59:24 thinpower [1349807.182728] 136791 pages reserved
>> Apr 24 08:59:24 thinpower [1349807.182729] 35053 pages shared
>> Apr 24 08:59:24 thinpower [1349807.182729] 8207346 pages non-shared
>> Apr 24 08:59:24 thinpower [1349807.182732] Out of memory: kill process
>> 21592 (ruby) score 109489 or a child
>> Apr 24 08:59:24 thinpower [1349807.182795] Killed process 21592 (ruby)
>> Apr 24 09:00:59 thinpower [1349916.949909] apache2 invoked oom-killer:
>> gfp_mask=0x1201d2, order=0, oomkilladj=0
>
> apache2 in this case (and postgres later) are valid oom killer targets
> because their /proc/pid/oom_adj score is 0. The only reason they would not
> be killed subsequently is that the already killed task, ruby, is
> PF_EXITING and has failed to exit. It has access to memory reserves,
> and if it hangs before dying, the machine will livelock.

Thanks for the info. We will try to chew on this.

Best
Zeno

2009-04-24 18:24:21

by Zeno Davatz

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

On Fri, Apr 24, 2009 at 7:17 PM, David Rientjes <[email protected]> wrote:
> On Fri, 24 Apr 2009, Zeno Davatz wrote:
>
>> Dear All
>>
>> Our Kernel-Version:
>>
>> Linux thinpower 2.6.29 #5 SMP Sun Apr 5 20:04:08 UTC 2009 x86_64
>> Intel(R) Xeon(R) CPU E5450 @ 3.00GHz GenuineIntel GNU/Linux
>>
>> This morning our server ran out of memory and had to be hard rebooted.
>> The system has 32 GB of memory and 4 quad-core Xeon CPUs.
>> /var/log/kernel/current tells us the following:
>>
>> Apr 24 09:01:06 thinpower [1349923.491914] postgres invoked
>> oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
>> Apr 24 09:01:06 thinpower [1349923.491919] postgres cpuset=/ mems_allowed=0
>> Apr 24 09:01:06 thinpower [1349923.491922] Pid: 2393, comm: postgres
>> Not tainted 2.6.29 #5
>> Apr 24 09:01:06 thinpower [1349923.491924] Call Trace:
>> Apr 24 09:01:06 thinpower [1349923.491934] ?[<ffffffff8106eda9>] ?
>> cpuset_print_task_mems_allowed+0x99/0x9f
>> Apr 24 09:01:06 thinpower [1349923.491939] ?[<ffffffff81083d28>]
>> oom_kill_process+0x96/0x246
>> Apr 24 09:01:06 thinpower [1349923.491942] ?[<ffffffff8106dcf0>] ?
>> cpuset_mems_allowed_intersects+0x1c/0x1e
>> Apr 24 09:01:06 thinpower [1349923.491944] ?[<ffffffff810840a5>] ?
>> badness+0x1a3/0x1e6
>> Apr 24 09:01:06 thinpower [1349923.491946] ?[<ffffffff8108421c>]
>> __out_of_memory+0x134/0x14b
>> Apr 24 09:01:06 thinpower [1349923.491949] ?[<ffffffff8108438b>]
>> out_of_memory+0x158/0x18a
>> Apr 24 09:01:06 thinpower [1349923.491951] ?[<ffffffff81087906>]
>> __alloc_pages_internal+0x372/0x434
>> Apr 24 09:01:06 thinpower [1349923.491954] ?[<ffffffff810aa490>]
>> alloc_pages_current+0xb9/0xc2
>> Apr 24 09:01:06 thinpower [1349923.491957] ?[<ffffffff81081a59>]
>> __page_cache_alloc+0x67/0x6b
>> Apr 24 09:01:06 thinpower [1349923.491959] ?[<ffffffff81089426>]
>> __do_page_cache_readahead+0x96/0x192
>> Apr 24 09:01:06 thinpower [1349923.491961] ?[<ffffffff81089575>]
>> do_page_cache_readahead+0x53/0x60
>> Apr 24 09:01:06 thinpower [1349923.491963] ?[<ffffffff81081ec7>]
>> filemap_fault+0x15e/0x313
>> Apr 24 09:01:06 thinpower [1349923.491967] ?[<ffffffff81094451>]
>> __do_fault+0x53/0x393
>> Apr 24 09:01:06 thinpower [1349923.491969] ?[<ffffffff81030531>] ?
>> __wake_up_sync+0x45/0x4e
>> Apr 24 09:01:06 thinpower [1349923.491972] ?[<ffffffff81096702>]
>> handle_mm_fault+0x36b/0x854
>> Apr 24 09:01:06 thinpower [1349923.491976] ?[<ffffffff813a1acb>] ?
>> release_sock+0xb0/0xbb
>> Apr 24 09:01:06 thinpower [1349923.491979] ?[<ffffffff81022c8f>] ?
>> default_spin_lock_flags+0x9/0xe
>> Apr 24 09:01:06 thinpower [1349923.491984] ?[<ffffffff814ac6fe>]
>> do_page_fault+0x662/0xa5c
>> Apr 24 09:01:06 thinpower [1349923.491988] ?[<ffffffff81400b8e>] ?
>> inet_sendmsg+0x46/0x53
>> Apr 24 09:01:06 thinpower [1349923.491992] ?[<ffffffff8139ed97>] ?
>> __sock_sendmsg+0x59/0x62
>> Apr 24 09:01:06 thinpower [1349923.491994] ?[<ffffffff8139f6d8>] ?
>> sock_sendmsg+0xc7/0xe0
>> Apr 24 09:01:06 thinpower [1349923.491996] ?[<ffffffff81086b5e>] ?
>> free_pages+0x32/0x36
>> Apr 24 09:01:06 thinpower [1349923.492000] ?[<ffffffff8104e7ec>] ?
>> autoremove_wake_function+0x0/0x38
>> Apr 24 09:01:06 thinpower [1349923.492004] ?[<ffffffff810c1d76>] ?
>> core_sys_select+0x1df/0x213
>> Apr 24 09:01:06 thinpower [1349923.492007] ?[<ffffffff810b2cc2>] ?
>> nameidata_to_filp+0x41/0x52
>> Apr 24 09:01:06 thinpower [1349923.492009] ?[<ffffffff8139f028>] ?
>> sockfd_lookup_light+0x1b/0x54
>> Apr 24 09:01:06 thinpower [1349923.492013] ?[<ffffffff81011ac8>] ?
>> read_tsc+0xe/0x24
>> Apr 24 09:01:06 thinpower [1349923.492017] ?[<ffffffff81055990>] ?
>> getnstimeofday+0x58/0xb4
>> Apr 24 09:01:06 thinpower [1349923.492019] ?[<ffffffff810516ed>] ?
>> ktime_get_ts+0x49/0x4e
>> Apr 24 09:01:06 thinpower [1349923.492022] ?[<ffffffff810c0d35>] ?
>> poll_select_copy_remaining+0xc5/0xea
>> Apr 24 09:01:06 thinpower [1349923.492023] ?[<ffffffff810c1ff3>] ?
>> sys_select+0xa7/0xbc
>> Apr 24 09:01:06 thinpower [1349923.492026] ?[<ffffffff814aa305>]
>> page_fault+0x25/0x30
>>
>
> Are these the last messages in the dmesg? There should be subsequent
> information that describes the current state of memory following the stack
> trace.
>
> If this is reproducible, I'd recommend enabling
> /proc/sys/vm/oom_dump_tasks so that the oom killer will dump the tasklist
> and show us what may be causing the livelock.

Ok, I believe I enabled it with

echo 1 > /proc/sys/vm/oom_dump_tasks

Where would I find the output if it livelocks again?

Thank you for your Feedback.

Best
Zeno

2009-04-24 19:00:31

by David Rientjes

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

On Fri, 24 Apr 2009, Zeno Davatz wrote:

> Apr 24 09:01:06 thinpower [1349923.693331] Out of memory: kill process
> 21490 (apache2) score 53801 or a child
> Apr 24 09:01:06 thinpower [1349923.693410] Killed process 21490 (apache2)
>

If your machine hangs here, then it's most likely because apache2 is
getting stuck in D state and cannot exit (and it has access to memory
reserves because of TIF_MEMDIE since it has been oom killed, so it may
deplete all memory).

I'm assuming that you're describing a machine hang as the inability to
ping it or ssh into it, not simply your apache server dying.

These types of livelocks are possible with the oom killer when a task
fails to exit, one possible way to fix that is to introduce an oom killer
timeout such that if a task fails to exit for a pre-defined period of
time, the oom killer will choose to kill another task in the hopes of
future memory freeing. The problem with that approach, however, is that
the hung task can consume an enormous amount of memory that will never be
freed.
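
If you can still get a shell the next time this happens, one rough way to
spot tasks stuck in D state (just a sketch using standard ps output, not
something the oom killer reports itself) is:

    # list tasks in uninterruptible sleep (D state)
    ps -eo pid,stat,wchan:30,comm | awk '$2 ~ /D/'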

> > If this is reproducible, I'd recommend enabling
> > /proc/sys/vm/oom_dump_tasks so that the oom killer will dump the tasklist
> > and show us what may be causing the livelock.
>
> Ok, how do I enable that? I will google for it.
>

You're right in your reply, you can enable it with

echo 1 > /proc/sys/vm/oom_dump_tasks

This will print the tasklist and some pertinent information alongside the
oom killer output you've already posted. It will give a better idea of
the memory usage on the machine and if killing a subsequent task would
actually help in this case.
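
Note that the echo only lasts until the next reboot; if you want the
setting to survive (assuming your distribution applies /etc/sysctl.conf at
boot), add

    vm.oom_dump_tasks = 1

to /etc/sysctl.conf and load it with sysctl -p.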

2009-04-24 19:16:59

by Zeno Davatz

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

Dear David

On Fri, Apr 24, 2009 at 9:00 PM, David Rientjes <[email protected]> wrote:
> On Fri, 24 Apr 2009, Zeno Davatz wrote:
>
>> Apr 24 09:01:06 thinpower [1349923.693331] Out of memory: kill process
>> 21490 (apache2) score 53801 or a child
>> Apr 24 09:01:06 thinpower [1349923.693410] Killed process 21490 (apache2)
>>
>
> If your machine hangs here, then it's most likely because apache2 is
> getting stuck in D state and cannot exit (and it has access to memory
> reserves because of TIF_MEMDIE since it has been oom killed, so it may
> deplete all memory).
>
> I'm assuming that you're describing a machine hang as the inability to
> ping it or ssh into it, not simply your apache server dying.

Yes correct. I could neither SSH into the machine nor could I type
anything on the screen after I connected the monitor and the keyboard
directly to the machine.

> These types of livelocks are possible with the oom killer when a task
> fails to exit, one possible way to fix that is to introduce an oom killer
> timeout such that if a task fails to exit for a pre-defined period of
> time, the oom killer will choose to kill another task in the hopes of
> future memory freeing. The problem with that approach, however, is that
> the hung task can consume an enormous amount of memory that will never be
> freed.

Thanks for the hint! Is there another solution as well? Any
Kernel-Upgrades in the Pipeline? What does Linus think about this?

>> > If this is reproducible, I'd recommend enabling
>> > /proc/sys/vm/oom_dump_tasks so that the oom killer will dump the tasklist
>> > and show us what may be causing the livelock.
>>
>> Ok, how do I enable that? I will google for it.
>>
>
> You're right in your reply, you can enable it with
>
>        echo 1 > /proc/sys/vm/oom_dump_tasks
>
> This will print the tasklist and some pertinent information alongside the
> oom killer output you've already posted. ?It will give a better idea of
> the memory usage on the machine and if killing a subsequent task would
> actually help in this case.

Ok, done that. I'm not exactly looking forward to it hanging again, but
if it does we should catch some fish. ;)

Best
Zeno

2009-04-24 19:27:19

by David Rientjes

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

On Fri, 24 Apr 2009, Zeno Davatz wrote:

> > These types of livelocks are possible with the oom killer when a task
> > fails to exit, one possible way to fix that is to introduce an oom killer
> > timeout such that if a task fails to exit for a pre-defined period of
> > time, the oom killer will choose to kill another task in the hopes of
> > future memory freeing. The problem with that approach, however, is that
> > the hung task can consume an enormous amount of memory that will never be
> > freed.
>
> Thanks for the hint! Is there another solution as well? Any
> Kernel-Upgrades in the Pipeline? What does Linus think about this?
>

I had a patchset that added a timeout for oom killed tasks now that the
entire oom killer is serialized, but it only really works well if your
memory is so partitioned (such as with cpusets or the memory controller)
that it actually makes sense to continue. With global system-wide ooms
such as yours, it would only result in a long chain of kills with no
positive outcome.

2009-04-25 08:33:04

by Zeno Davatz

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

Dear David

On Fri, Apr 24, 2009 at 9:26 PM, David Rientjes <[email protected]> wrote:
> On Fri, 24 Apr 2009, Zeno Davatz wrote:
>
>> > These types of livelocks are possible with the oom killer when a task
>> > fails to exit, one possible way to fix that is to introduce an oom killer
>> > timeout such that if a task fails to exit for a pre-defined period of
>> > time, the oom killer will choose to kill another task in the hopes of
>> > future memory freeing. The problem with that approach, however, is that
>> > the hung task can consume an enormous amount of memory that will never be
>> > freed.
>>
>> Thanks for the hint! Is there another solution as well? Any
>> Kernel-Upgrades in the Pipeline? What does Linus think about this?
>>
>
> I had a patchset that added a timeout for oom killed tasks now that the
> entire oom killer is serialized, but it only really works well if your
> memory is so partitioned (such as with cpusets or the memory controller)
>> that it actually makes sense to continue. With global system-wide ooms
> such as yours, it would only result in a long chain of kills with no
> positive outcome.

Couldn't you start killing processes at an earlier stage? It seems to
me that the point at which processes get killed by the oom_killer is a
bit late in the game, when the machine is already very close to
freezing up. We had these oom_kill messages in our
/var/log/kernel/current several times, spread out over a long period of time.

Of course we now also set a roadblock to our main application
(http://ch.oddb.org) that kills the application if it uses more than
10 GB of memory.
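
(Roughly along these lines; a simplified sketch rather than the exact
script we run, and it assumes the process name is "ruby" with the 10 GB
limit expressed in kB:)

    # hypothetical watchdog: kill the process if its resident set exceeds ~10 GB
    pid=$(pidof -s ruby)
    rss_kb=$(awk '/VmRSS/ {print $2}' /proc/$pid/status)
    [ "$rss_kb" -gt 10485760 ] && kill "$pid"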

Best
Zeno

2009-04-26 19:55:45

by David Rientjes

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

On Sat, 25 Apr 2009, Zeno Davatz wrote:

> Couldn't you start killing processes at an earlier stage? It seems to
> me that the point at which processes get killed by the oom_killer is a
> bit late in the game, when the machine is already very close to
> freezing up. We had these oom_kill messages in our
> /var/log/kernel/current several times, spread out over a long period of time.
>

The oom killer serves a very well-defined purpose: to kill a
memory-hogging task when your system (or cpuset, memory controller) is out
of memory. When it kills a task, it also allows access to memory reserves
that allow it to allocate beyond each allowable zone's min watermark.
This is intended to allow the task to quickly exit.
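
You can see those per-zone watermarks on a running system directly, for
example (just a quick way to eyeball the free/min/low/high values that
also appear in the dump you posted):

    grep -A3 'pages free' /proc/zoneinfo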

There is work being done on a per-cgroup lowmem notifier which would allow
userspace to poll() on a device file and be notified of low memory
situations so that it can react on its own (like killing a low priority
task, expanding a cpuset, increasing a memcontroller limit, etc.)

2009-04-27 06:33:48

by Zeno Davatz

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

Dear David

On Sun, Apr 26, 2009 at 9:55 PM, David Rientjes <[email protected]> wrote:
> On Sat, 25 Apr 2009, Zeno Davatz wrote:
>
>> Couldn't you start killing processes at an earlier stage? It seems to
>> me that the point at which processes get killed by the oom_killer is a
>> bit late in the game, when the machine is already very close to
>> freezing up. We had these oom_kill messages in our
>> /var/log/kernel/current several times, spread out over a long period of time.
>>
>
> The oom killer serves a very well-defined purpose: to kill a
> memory-hogging task when your system (or cpuset, memory controller) is out
> of memory.

Maybe sometimes too late in the process, when the system _is_ already
out of memory ;) - because one task just uses 99% of the memory.

> When it kills a task, it also allows access to memory reserves
> that allow it to allocate beyond each allowable zone's min watermark.
> This is intended to allow the task to quickly exit.

Can you recommend a link where I can read up on this issue / problem?
It seems to me that when the system is already out of memory it does
not have enough "memory" left to free up some fresh memory for itself,
especially if there is one application that uses about 99% of the
memory. I would suggest that in such a case it should just kill the
task that uses the most memory.

> There is work being done on a per-cgroup lowmem notifier which would allow
> userspace to poll() on a device file and be notified of low memory
> situations so that it can react on its own (like killing a low priority
> task, expanding a cpuset, increasing a memcontroller limit, etc.)

This sounds very interesting. Would this work together with udev? How
would it go about killing a task like "Apache" or "Ruby" or any other
application that is running on the Linux machine?

Best
Zeno

2009-04-27 08:33:18

by David Rientjes

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

On Mon, 27 Apr 2009, Zeno Davatz wrote:

> Can you recommend a link where I can read up on this issue / problem?
> It seems to me that when the system is already out of memory it does
> not have enough "memory" left to free up some fresh memory for itself,
> especially if there is one application that uses about 99% of the
> memory. I would suggest that in such a case it should just kill the
> task that uses the most memory.
>

All allowable zones have minimum watermarks that define how much memory
an oom killed task may use so that it can quickly exit. This is the
purpose of the TIF_MEMDIE flag in the kernel.

> This sounds very interesting. Would this work together with udev? How
> would it go about killing a task like "Apache" or "Ruby" or any other
> application that is running on the Linux machine?
>

The policy is better defined in userspace to determine what memory-hogging
task should be killed in low memory situations. Users can already tune
the /proc/pid/oom_adj score of any task so that the oom killer can prefer
it over other tasks (see Documentation/filesystems/proc.txt). For low
memory situations, however, userspace can act in any number of ways to
prevent reaching an oom condition.
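
For example (only a sketch; it assumes, purely for illustration, that you
want the oom killer to prefer your ruby workers and to leave postgres
alone; valid values range from -17, which disables oom killing for that
task, to +15):

    echo 15 > /proc/$(pidof -s ruby)/oom_adj
    echo -17 > /proc/$(pidof -s postgres)/oom_adj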

Once the oom condition has been reached, however, the oom killer must act
to free some memory, otherwise the kernel usually livelocks.

2009-04-27 08:54:13

by Zeno Davatz

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

Dear David

On Mon, Apr 27, 2009 at 10:32 AM, David Rientjes <[email protected]> wrote:

> All allowable zones have minimum watermarks that define how much memory
> an oom killed task may use so that it can quickly exit. This is the
> purpose of the TIF_MEMDIE flag in the kernel.

Thanks for the info.

> The policy is better defined in userspace to determine what memory-hogging
> task should be killed in low memory situations. Users can already tune
> the /proc/pid/oom_adj score of any task so that the oom killer can prefer
> it over other tasks (see Documentation/filesystems/proc.txt). For low
> memory situations, however, userspace can act in any number of ways to
> prevent reaching an oom condition.

This may be off-topic, but: can I assign a PID to a process at
startup? Say Apache should always have PID 2755, or Ruby should always
have PID 1003. Otherwise, how would I set the score for a PID at
kernel boot-up?

> Once the oom condition has been reached, however, the oom killer must act
> to free some memory, otherwise the kernel usually livelocks.

Interesting. In our case we had not set a score for Apache in

/proc/pid/oom_adj

and that must have caused the livelock, because oom_kill could not kill
the biggest task (Apache) in full.
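
(One way to do it without pinning the PID would presumably be to look the
processes up by name once the service is running, along the lines of this
untested sketch in the init script:)

    # untested sketch: make apache2 a preferred oom killer target
    for pid in $(pidof apache2); do
        echo 15 > /proc/$pid/oom_adj
    done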

Best
Zeno

2009-04-27 12:03:52

by Zeno Davatz

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

Dear David

Thanks for all your help and input. It was very helpful and useful.

On Mon, Apr 27, 2009 at 10:53 AM, Zeno Davatz <[email protected]> wrote:
> On Mon, Apr 27, 2009 at 10:32 AM, David Rientjes <[email protected]> wrote:
>
>> All allowable zones have minimum watermarks that define how much memory
>> an oom killed task may use so that it can quickly exit. This is the
>> purpose of the TIF_MEMDIE flag in the kernel.
>
> Thanks for the info.
>
>> The policy is better defined in userspace to determine what memory-hogging
>> task should be killed in low memory situations. Users can already tune
>> the /proc/pid/oom_adj score of any task so that the oom killer can prefer
>> it over other tasks (see Documentation/filesystems/proc.txt). For low
>> memory situations, however, userspace can act in any number of ways to
>> prevent reaching an oom condition.
>
> This may be off-topic, but: can I assign a PID to a process at
> startup? Say Apache should always have PID 2755, or Ruby should always
> have PID 1003. Otherwise, how would I set the score for a PID at
> kernel boot-up?

Ok, we solved this issue, with the following commit:

http://scm.ywesee.com/?p=oddb.org;a=blobdiff;f=bin/oddbd;h=655b4c235594159d3fe4290696ecc54580188c7c;hp=9cb744b4affe30beee935a495223015d9ce090ce;hb=6d38f6839f746bcf2dc701528e343170af8f47be;hpb=a8b4f8c43200a0035df9f5002a5ea9ec8133e99e

Best
Zeno

2009-04-27 13:35:51

by Greg Stark

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

On Sun, Apr 26, 2009 at 8:55 PM, David Rientjes <[email protected]> wrote:
> There is work being done on a per-cgroup lowmem notifier which would allow
> userspace to poll() on a device file and be notified of low memory
> situations so that it can react on its own (like killing a low priority
> task, expanding a cpuset, increasing a memcontroller limit, etc.)

Surely anything that important ought to be running with overcommit
disabled anyways.

2009-04-27 13:39:55

by Zeno Davatz

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

Dear Greg

On Mon, Apr 27, 2009 at 3:35 PM, Greg Stark <[email protected]> wrote:
> On Sun, Apr 26, 2009 at 8:55 PM, David Rientjes <[email protected]> wrote:
>> There is work being done on a per-cgroup lowmem notifier which would allow
>> userspace to poll() on a device file and be notified of low memory
>> situations so that it can react on its own (like killing a low priority
>> task, expanding a cpuset, increasing a memcontroller limit, etc.)
>
> Surely anything that important ought to be running with overcommit
> disabled anyways.

Where do I enable or disable overcommit?

Best
Zeno

2009-04-27 14:07:50

by Zeno Davatz

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

On Mon, Apr 27, 2009 at 3:39 PM, Zeno Davatz <[email protected]> wrote:
> On Mon, Apr 27, 2009 at 3:35 PM, Greg Stark <[email protected]> wrote:
>> On Sun, Apr 26, 2009 at 8:55 PM, David Rientjes <[email protected]> wrote:
>>> There is work being done on a per-cgroup lowmem notifier which would allow
>>> userspace to poll() on a device file and be notified of low memory
>>> situations so that it can react on its own (like killing a low priority
>>> task, expanding a cpuset, increasing a memcontroller limit, etc.)
>>
>> Surely anything that important ought to be running with overcommit
>> disabled anyways.
>
> Where do I enable or disable overcommit?

Ok, I found this one as well:

Documentation/vm/overcommit-accounting

and

http://www.linuxdevcenter.com/pub/a/linux/2006/11/30/linux-out-of-memory.html?page=2

is a useful article for the beginner.
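
(In short, the three modes described there seem to be:)

    echo 0 > /proc/sys/vm/overcommit_memory   # heuristic overcommit (default)
    echo 1 > /proc/sys/vm/overcommit_memory   # always overcommit, never refuse
    echo 2 > /proc/sys/vm/overcommit_memory   # strict accounting, no overcommit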

Best
Zeno

2009-04-27 14:52:00

by Zeno Davatz

Subject: Re: Kernel 2.6.29 runs out of memory and hangs.

On Mon, Apr 27, 2009 at 3:35 PM, Greg Stark <[email protected]> wrote:
> On Sun, Apr 26, 2009 at 8:55 PM, David Rientjes <[email protected]> wrote:
>> There is work being done on a per-cgroup lowmem notifier which would allow
>> userspace to poll() on a device file and be notified of low memory
>> situations so that it can react on its own (like killing a low priority
>> task, expanding a cpuset, increasing a memcontroller limit, etc.)
>
> Surely anything that important ought to be running with overcommit
> disabled anyways.

Ok, I tried this suggestion as well. I did:

echo 2 > /proc/sys/vm/overcommit_memory

echo 0 > /proc/sys/vm/overcommit_ratio (Note: our system does not have
any swap, just 32 GB of memory).

Then I tried to run "htop" and got

bash: fork: Cannot allocate memory

I also tried it with

echo 1 > /proc/sys/vm/overcommit_ratio

Same result.
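
(Presumably that is expected: with overcommit_memory=2 the commit limit is
roughly swap + RAM * overcommit_ratio / 100, so with no swap and a ratio
of 0 or 1 there is almost nothing left that a fork() may commit.
/proc/meminfo shows the numbers:)

    grep -i commit /proc/meminfo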

Best
Zeno