Hi list
ProLiant DL360 G5, Dual Xeon E5345 @ 2.33GHz, 4 GB RAM
kernel 2.6.24.4
samba/slapd/heartbeat/drbd/mon
Last night mon triggered a failover because slapd died.
This was in messages:
Jun 14 01:00:17 pippo kernel: sendmail-mta invoked oom-killer: gfp_mask=0x800d0, order=0, oomkilladj=0
Jun 14 01:00:17 pippo kernel: Pid: 4520, comm: sendmail-mta Not tainted 2.6.24.4 #2
Jun 14 01:00:17 pippo kernel: [<c014a819>] oom_kill_process+0x54/0xf8
Jun 14 01:00:17 pippo kernel: [<c014ac00>] out_of_memory+0x15c/0x190
Jun 14 01:00:17 pippo kernel: [<c014c7b2>] __alloc_pages+0x239/0x2c7
Jun 14 01:00:17 pippo kernel: [<c01661d5>] cp_new_stat64+0xfc/0x10e
Jun 14 01:00:17 pippo kernel: [<c014c879>] __get_free_pages+0x39/0x47
Jun 14 01:00:17 pippo kernel: [<c018f8cd>] proc_file_read+0x78/0x237
Jun 14 01:00:17 pippo kernel: [<c018f855>] proc_file_read+0x0/0x237
Jun 14 01:00:17 pippo kernel: [<c018c3c1>] proc_reg_read+0x5c/0x6f
Jun 14 01:00:17 pippo kernel: [<c018c365>] proc_reg_read+0x0/0x6f
Jun 14 01:00:17 pippo kernel: [<c0163f99>] vfs_read+0x9f/0x121
Jun 14 01:00:17 pippo kernel: [<c0164396>] sys_read+0x41/0x67
Jun 14 01:00:17 pippo kernel: [<c0103d56>] sysenter_past_esp+0x5f/0x85
Jun 14 01:00:17 pippo kernel: =======================
Jun 14 01:00:17 pippo kernel: Mem-info:
Jun 14 01:00:17 pippo kernel: DMA per-cpu:
Jun 14 01:00:17 pippo kernel: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 1: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 2: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 3: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 4: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 5: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 6: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 7: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jun 14 01:00:17 pippo kernel: Normal per-cpu:
Jun 14 01:00:17 pippo kernel: CPU 0: Hot: hi: 186, btch: 31 usd: 116 Cold: hi: 62, btch: 15 usd: 61
Jun 14 01:00:17 pippo kernel: CPU 1: Hot: hi: 186, btch: 31 usd: 37 Cold: hi: 62, btch: 15 usd: 49
Jun 14 01:00:17 pippo kernel: CPU 2: Hot: hi: 186, btch: 31 usd: 107 Cold: hi: 62, btch: 15 usd: 61
Jun 14 01:00:17 pippo kernel: CPU 3: Hot: hi: 186, btch: 31 usd: 41 Cold: hi: 62, btch: 15 usd: 56
Jun 14 01:00:17 pippo kernel: CPU 4: Hot: hi: 186, btch: 31 usd: 13 Cold: hi: 62, btch: 15 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 5: Hot: hi: 186, btch: 31 usd: 167 Cold: hi: 62, btch: 15 usd: 59
Jun 14 01:00:17 pippo kernel: CPU 6: Hot: hi: 186, btch: 31 usd: 61 Cold: hi: 62, btch: 15 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 7: Hot: hi: 186, btch: 31 usd: 102 Cold: hi: 62, btch: 15 usd: 0
Jun 14 01:00:17 pippo kernel: HighMem per-cpu:
Jun 14 01:00:17 pippo kernel: CPU 0: Hot: hi: 186, btch: 31 usd: 2 Cold: hi: 62, btch: 15 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 1: Hot: hi: 186, btch: 31 usd: 33 Cold: hi: 62, btch: 15 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 2: Hot: hi: 186, btch: 31 usd: 91 Cold: hi: 62, btch: 15 usd: 12
Jun 14 01:00:17 pippo kernel: CPU 3: Hot: hi: 186, btch: 31 usd: 172 Cold: hi: 62, btch: 15 usd: 7
Jun 14 01:00:17 pippo kernel: CPU 4: Hot: hi: 186, btch: 31 usd: 16 Cold: hi: 62, btch: 15 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 5: Hot: hi: 186, btch: 31 usd: 63 Cold: hi: 62, btch: 15 usd: 11
Jun 14 01:00:17 pippo kernel: CPU 6: Hot: hi: 186, btch: 31 usd: 83 Cold: hi: 62, btch: 15 usd: 0
Jun 14 01:00:17 pippo kernel: CPU 7: Hot: hi: 186, btch: 31 usd: 41 Cold: hi: 62, btch: 15 usd: 0
Jun 14 01:00:17 pippo kernel: Active:63299 inactive:6580 dirty:93 writeback:0 unstable:0
Jun 14 01:00:17 pippo kernel: free:745742 slab:211387 mapped:4303 pagetables:412 bounce:0
Jun 14 01:00:17 pippo kernel: DMA free:3548kB min:68kB low:84kB high:100kB active:0kB inactive:4kB present:16256kB pages_scanned:25 all_unreclaimable? yes
Jun 14 01:00:17 pippo kernel: lowmem_reserve[]: 0 873 4810 4810
Jun 14 01:00:17 pippo kernel: Normal free:3736kB min:3744kB low:4680kB high:5616kB active:4616kB inactive:4672kB present:894080kB pages_scanned:14277 all_unreclaimable? yes
Jun 14 01:00:17 pippo kernel: lowmem_reserve[]: 0 0 31496 31496
Jun 14 01:00:17 pippo kernel: HighMem free:2975684kB min:512kB low:4736kB high:8960kB active:248580kB inactive:21644kB present:4031488kB pages_scanned:0 all_unreclaimable? no
Jun 14 01:00:17 pippo kernel: lowmem_reserve[]: 0 0 0 0
Jun 14 01:00:17 pippo kernel: DMA: 7*4kB 2*8kB 1*16kB 1*32kB 2*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3548kB
Jun 14 01:00:17 pippo kernel: Normal: 58*4kB 5*8kB 2*16kB 2*32kB 5*64kB 2*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3760kB
Jun 14 01:00:17 pippo kernel: HighMem: 7371*4kB 5859*8kB 7946*16kB 6213*32kB 4077*64kB 3064*128kB 1859*256kB 889*512kB 452*1024kB 161*2048kB 48*4096kB = 2975684kB
Jun 14 01:00:17 pippo kernel: Swap cache: add 32, delete 32, find 16/19, race 0+0
Jun 14 01:00:17 pippo kernel: Free swap = 3903752kB
Jun 14 01:00:17 pippo kernel: Total swap = 3903752kB
Jun 14 01:00:17 pippo kernel: Free swap: 3903752kB
Jun 14 01:00:17 pippo kernel: 1245183 pages of RAM
Jun 14 01:00:17 pippo kernel: 1015807 pages of HIGHMEM
Jun 14 01:00:17 pippo kernel: 207787 reserved pages
Jun 14 01:00:17 pippo kernel: 19533 pages shared
Jun 14 01:00:17 pippo kernel: 0 pages swap cached
Jun 14 01:00:17 pippo kernel: 93 pages dirty
Jun 14 01:00:17 pippo kernel: 0 pages writeback
Jun 14 01:00:17 pippo kernel: 4303 pages mapped
Jun 14 01:00:17 pippo kernel: 211387 pages slab
Jun 14 01:00:17 pippo kernel: 412 pages pagetables
Jun 14 01:00:17 pippo kernel: Out of memory: kill process 6873 (slapd) score 5003 or a child
Jun 14 01:00:17 pippo kernel: Killed process 6873 (slapd)
Can anyone help me understand what went wrong, and whether I need to
upgrade to the latest kernel version?
Thanks in advance
On Sat, Jun 14, 2008 at 12:13 PM, Marco Barbero wrote:
> samba/slapd/heartbeat/drbd/mon
...
> Jun 14 01:00:17 pippo kernel: Out of memory: kill process 6873 (slapd)
> score 5003 or a child
> Jun 14 01:00:17 pippo kernel: Killed process 6873 (slapd)
>
> Anyone can help me in understanding what went wrong? And if I need to
> upgrade to last kernel version?
One or more processes on your system used too much memory. Dumping the
output of the following command periodically (e.g. every 10 minutes)
to a file will tell you which process is using too much memory:
{ ps aux | head -n 1; ps aux | sort -n +4 | tail -n 10; }
Bart.
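For reference, the periodic dump suggested above could be scripted along
these lines. This is only a sketch: the function name top_mem and the log
path in the comment are made up for illustration, and the sort key is
spelled -k 5 because recent GNU sort may treat the obsolete +4 form as a
file name (both select the fifth column of ps aux output, VSZ).

```shell
#!/bin/sh
# top_mem [N]: read `ps aux`-style text on stdin and print the header
# line followed by the N entries with the largest fifth column (VSZ).
# The name and the default of 10 are illustrative choices.
top_mem() {
    n="${1:-10}"
    IFS= read -r header          # first line is the column header
    printf '%s\n' "$header"
    sort -n -k 5 | tail -n "$n"  # remaining lines, largest VSZ last
}

# Possible use from cron every 10 minutes (log path is an assumption):
#   { date; ps aux | top_mem 10; } >> /var/log/ps-mem.log
```

Sorting on column 6 instead (-k 6) would rank by resident set size rather
than virtual size, which may be the more useful figure when hunting a leak.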
2008/6/14 Bart Van Assche:
> On Sat, Jun 14, 2008 at 12:13 PM, Marco Barbero wrote:
>> samba/slapd/heartbeat/drbd/mon
> ...
>> Jun 14 01:00:17 pippo kernel: Out of memory: kill process 6873 (slapd)
>> score 5003 or a child
>> Jun 14 01:00:17 pippo kernel: Killed process 6873 (slapd)
>>
>> Anyone can help me in understanding what went wrong? And if I need to
>> upgrade to last kernel version?
>
> One or more processes on your system used too much memory. Dumping the
> output of the following command periodically (e.g. every 10 minutes)
> to a file will tell you which process is using too much memory:
> { ps aux | head -n 1; ps aux | sort -n +4 | tail -n 10; }
I'll do it. I should say that this box had an uptime of 1 month and
never showed any issues. Also, the OOM killer was invoked at night, when
the load is at its minimum.
Could it be related to slab and the quicklist leak?
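One way to gauge the slab question from the report itself is to convert
the slab page count to kB and set it against the size of the Normal zone.
The sketch below assumes 4 kB pages (the usual x86 page size) and takes
both figures from the OOM report; the helper name pages_to_kb is
hypothetical.

```shell
# pages_to_kb PAGES: convert a page count to kB, assuming 4 kB pages.
pages_to_kb() {
    echo $(( $1 * 4 ))
}

slab_kb=$(pages_to_kb 211387)    # "slab:211387" from the report
normal_kb=894080                 # "Normal ... present:894080kB"

echo "slab:   ${slab_kb} kB"
echo "Normal: ${normal_kb} kB"
echo "slab as share of Normal zone: $(( slab_kb * 100 / normal_kb ))%"
```

That works out to 845548 kB of slab, about 94% of the Normal zone. On a
32-bit kernel slab objects generally have to live in low memory, so this
would be consistent with slab growth exhausting lowmem, although the
report alone does not say which cache grew.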
On Sat, Jun 14, 2008 at 7:34 PM, Marco Barbero wrote:
>> One or more processes on your system used too much memory. Dumping the
>> output of the following command periodically (e.g. every 10 minutes)
>> to a file will tell you which process is using too much memory:
>
>> { ps aux | head -n 1; ps aux | sort -n +4 | tail -n 10; }
>
> I'll do it. Have to say that this box had an uptime of 1 month and
> never show issues. Also oom was invoked in the night when load is
> minimum.
> Could be related to slab and the quicklist leak?
Are you running the kernel in 32-bit mode? Wouldn't it be better to
run a 64-bit kernel with 4 GB of RAM?
Bart.
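A quick way to answer the 32-bit question on the box itself might look
like this. It is a sketch only: kernel_bits is a made-up helper that
recognises just the common x86 machine strings printed by uname -m.

```shell
# kernel_bits MACHINE: classify a `uname -m` machine string as 32 or 64 bit.
# Only common x86 names are handled; anything else is reported as unknown.
kernel_bits() {
    case "$1" in
        x86_64|amd64)        echo 64 ;;
        i386|i486|i586|i686) echo 32 ;;
        *)                   echo unknown ;;
    esac
}

echo "kernel: $(kernel_bits "$(uname -m)")-bit"

# A HighTotal line in /proc/meminfo is another hint that the kernel was
# built with highmem support, which only 32-bit kernels need.
grep '^HighTotal' /proc/meminfo 2>/dev/null || true
```

The HighMem zone and the eight-digit addresses in the trace above already
point to a 32-bit kernel here.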
On Sat, Jun 14, 2008 at 05:31:18PM +0200, Bart Van Assche wrote:
> On Sat, Jun 14, 2008 at 12:13 PM, Marco Barbero wrote:
> > samba/slapd/heartbeat/drbd/mon
> ...
> > Jun 14 01:00:17 pippo kernel: Out of memory: kill process 6873 (slapd)
> > score 5003 or a child
> > Jun 14 01:00:17 pippo kernel: Killed process 6873 (slapd)
> >
> > Anyone can help me in understanding what went wrong? And if I need to
> > upgrade to last kernel version?
>
> One or more processes on your system used too much memory. Dumping the
> output of the following command periodically (e.g. every 10 minutes)
> to a file will tell you which process is using too much memory:
>
> { ps aux | head -n 1; ps aux | sort -n +4 | tail -n 10; }
There was lots of swap free:
Jun 14 01:00:17 pippo kernel: Free swap = 3903752kB
Jun 14 01:00:17 pippo kernel: Total swap = 3903752kB
Jun 14 01:00:17 pippo kernel: Free swap: 3903752kB
Something is wrong.
On Mon, 2008-06-16 at 16:55 -0300, Marcelo Tosatti wrote:
> On Sat, Jun 14, 2008 at 05:31:18PM +0200, Bart Van Assche wrote:
> > On Sat, Jun 14, 2008 at 12:13 PM, Marco Barbero wrote:
> > > samba/slapd/heartbeat/drbd/mon
> > ...
> > > Jun 14 01:00:17 pippo kernel: Out of memory: kill process 6873 (slapd)
> > > score 5003 or a child
> > > Jun 14 01:00:17 pippo kernel: Killed process 6873 (slapd)
> > >
> > > Anyone can help me in understanding what went wrong? And if I need to
> > > upgrade to last kernel version?
> >
> > One or more processes on your system used too much memory. Dumping the
> > output of the following command periodically (e.g. every 10 minutes)
> > to a file will tell you which process is using too much memory:
> >
> > { ps aux | head -n 1; ps aux | sort -n +4 | tail -n 10; }
>
> There were lots of swap free:
>
> Jun 14 01:00:17 pippo kernel: Free swap = 3903752kB
> Jun 14 01:00:17 pippo kernel: Total swap = 3903752kB
> Jun 14 01:00:17 pippo kernel: Free swap: 3903752kB
>
> Something is wrong.
It was a normal GFP_KERNEL allocation:
Jun 14 01:00:17 pippo kernel: sendmail-mta invoked oom-killer: gfp_mask=0x800d0, order=0, oomkilladj=0
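To see which flags make up that mask, the low bits can be picked apart
with shell arithmetic. The bit values below are an assumption taken from
include/linux/gfp.h in kernels of roughly this vintage and are not stable
across versions; decode_gfp is a hypothetical helper, and any bit it does
not know is printed in hex rather than guessed at.

```shell
# decode_gfp MASK: print the names of the GFP_KERNEL bits set in MASK.
# Assumed values (2.6.24-era gfp.h): __GFP_WAIT 0x10, __GFP_IO 0x40,
# __GFP_FS 0x80. Remaining bits are reported as "other:0x...".
decode_gfp() {
    mask=$(( $1 ))
    out=""
    [ $(( mask & 0x10 )) -ne 0 ] && out="$out __GFP_WAIT"
    [ $(( mask & 0x40 )) -ne 0 ] && out="$out __GFP_IO"
    [ $(( mask & 0x80 )) -ne 0 ] && out="$out __GFP_FS"
    rest=$(( mask & ~0xd0 ))
    [ "$rest" -ne 0 ] && out="$out other:$(printf '0x%x' "$rest")"
    echo "${out# }"
}

decode_gfp 0x800d0
```

The three named bits together are GFP_KERNEL (0xd0). The leftover 0x80000
is probably __GFP_RECLAIMABLE in this kernel, which would make the mask
GFP_TEMPORARY; that is worth checking against the gfp.h of the exact tree.
Either way the allocation may sleep and may not use HighMem, so it behaves
like GFP_KERNEL for the purposes of this analysis.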
And the system /is/ low on memory:
Jun 14 01:00:17 pippo kernel: DMA free:3548kB min:68kB low:84kB high:100kB active:0kB inactive:4kB present:16256kB pages_scanned:25 all_unreclaimable? yes
Jun 14 01:00:17 pippo kernel: lowmem_reserve[]: 0 873 4810 4810
Jun 14 01:00:17 pippo kernel: Normal free:3736kB min:3744kB low:4680kB high:5616kB active:4616kB inactive:4672kB present:894080kB pages_scanned:14277 all_unreclaimable? yes
Jun 14 01:00:17 pippo kernel: lowmem_reserve[]: 0 0 31496 31496
Jun 14 01:00:17 pippo kernel: HighMem free:2975684kB min:512kB low:4736kB high:8960kB active:248580kB inactive:21644kB present:4031488kB pages_scanned:0 all_unreclaimable? no
There was still high memory available, and swap, but ZONE_DMA and
ZONE_NORMAL were both exhausted and marked "all_unreclaimable? yes". So
__alloc_pages(GFP_KERNEL) fails.
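The per-zone failure can be checked by hand with the figures in the
report. The sketch below is a simplified model of the kernel's watermark
test, not the real code: it assumes 4 kB pages, an order-0 request, and
that the lowmem_reserve entry to use is index 1 (the Normal zone) because
the request cannot be served from HighMem. watermark_ok is a made-up name.

```shell
# watermark_ok FREE_KB MIN_KB RESERVE_PAGES
# Succeeds when free memory exceeds the min watermark plus the lowmem
# reserve (converted from pages to kB, assuming 4 kB pages).
watermark_ok() {
    [ "$1" -gt $(( $2 + $3 * 4 )) ]
}

# free, min and lowmem_reserve[1] as printed for each zone in the report
watermark_ok 3548 68 873 && echo "DMA: ok"    || echo "DMA: below watermark"
watermark_ok 3736 3744 0 && echo "Normal: ok" || echo "Normal: below watermark"
```

For DMA the threshold is 68 + 873 * 4 = 3560 kB against 3548 kB free, and
for Normal it is 3744 kB against 3736 kB free, so both zones fall just
short even though HighMem has nearly 3 GB free.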
It might be TCP buffers; see
http://marc.info/?l=linux-netdev&m=121362441431941&w=2
In which case, use this workaround:
echo "98304 131072 196608" > /proc/sys/net/ipv4/tcp_mem
Mike.
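Note that tcp_mem is expressed in pages, not bytes, and its three fields
are the low, pressure and high thresholds. The sketch below converts the
suggested triple to MiB so the limits are easier to judge; it assumes 4 kB
pages, and tcp_mem_mib is a hypothetical helper name.

```shell
# tcp_mem_mib "LOW PRESSURE HIGH": convert tcp_mem page counts to MiB,
# assuming 4 kB pages.
tcp_mem_mib() {
    set -- $1                    # split the triple into $1 $2 $3
    echo "$(( $1 * 4 / 1024 )) $(( $2 * 4 / 1024 )) $(( $3 * 4 / 1024 ))"
}

tcp_mem_mib "98304 131072 196608"

# Inspect the current setting before changing it:
#   cat /proc/sys/net/ipv4/tcp_mem
# Same change as the echo above, via sysctl (needs root):
#   sysctl -w net.ipv4.tcp_mem="98304 131072 196608"
```

The suggested values come to 384, 512 and 768 MiB. Since TCP buffers are
kernel memory, a high threshold close to the size of the Normal zone
(about 873 MiB here) would let them crowd out everything else in lowmem.
Neither the echo nor sysctl -w survives a reboot; a matching
net.ipv4.tcp_mem line in /etc/sysctl.conf is the usual way to keep it.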