2005-03-15 20:48:42

by Noah Meyerhans

[permalink] [raw]
Subject: OOM problems with 2.6.11-rc4

Hello. We have a server, currently running 2.6.11-rc4, that is
experiencing similar OOM problems to those described at
http://groups-beta.google.com/group/fa.linux.kernel/msg/9633559fea029f6e
and discussed further by several developers here (the summary is at
http://www.kerneltraffic.org/kernel-traffic/kt20050212_296.html#6). We
are running 2.6.11-rc4 because it contains the patches that Andrea
mentioned in the kerneltraffic link. The problem was present in 2.6.10
as well. We can try newer 2.6 kernels if it helps.

The machine in question is a dual Xeon system with 2 GB of RAM, 3.5 GB
of swap, and several TB of NFS exported filesystems. One notable point
is that this machine has been running in overcommit mode 2
(/proc/sys/vm/overcommit_memory = 2) and the OOM killer is still being
triggered, which is allegedly not supposed to be possible according to
the kerneltraffic.org document above. We had been running in overcommit
mode 0 until about a month ago, and experienced similar OOM problems
then as well.
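
(For reference, overcommit mode 2 caps the committed address space at
swap plus overcommit_ratio percent of RAM; with the default ratio of 50
that works out to roughly 3.5 GB + 1 GB, i.e. about 4.5 GB on this box,
which is well above anything this workload actually commits.)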

The problem can be somewhat reliably triggered by running our backup
software on a particular filesystem. The backup software attempts to
keep the entire file list in memory, and this filesystem contains
several million files, so lots of memory is being allocated.

The server experienced these problems today and we captured the kernel
output, which is included below. Note that this machine has not used
very much swap at all, and we've never observed it completely running
out of swap.

Note that in this kernel output, the last memory dump is from the magic
SysRq key. By the time we've reached this point, the machine is
unresponsive and our next action is to trigger a sync+reboot via the
SysRq key.

File content:
057 slab:220275 mapped:12395 pagetables:118
DMA free:3588kB min:68kB low:84kB high:100kB active:0kB inactive:696kB present:16384kB pages_scanned:1203 all_unreclaimable? yes
lowmem_reserve[]: 0 880 2031
Normal free:3744kB min:3756kB low:4692kB high:5632kB active:0kB inactive:368kB present:901120kB pages_scanned:683 all_unreclaimable? yes
lowmem_reserve[]: 0 0 9212
HighMem free:896kB min:512kB low:640kB high:768kB active:50076kB inactive:1121156kB present:1179136kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 2*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3588kB
Normal: 0*4kB 10*8kB 1*16kB 2*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3744kB
HighMem: 82*4kB 1*8kB 1*16kB 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 896kB
Swap cache: add 2582, delete 2011, find 276/524, race 0+0
Free swap = 3610572kB
Total swap = 3615236kB
Out of Memory: Killed process 1188 (exim).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16

HighMem per-cpu:

cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16

Free pages: 9196kB (1856kB HighMem)
Active:12382 inactive:280459 dirty:214 writeback:0 unstable:0 free:2299 slab:220221 mapped:12256 pagetables:122
DMA free:3588kB min:68kB low:84kB high:100kB active:0kB inactive:736kB present:16384kB pages_scanned:5706 all_unreclaimable? yes
lowmem_reserve[]: 0 880 2031
Normal free:3752kB min:3756kB low:4692kB high:5632kB active:0kB inactive:368kB present:901120kB pages_scanned:6943 all_unreclaimable? yes
lowmem_reserve[]: 0 0 9212
HighMem free:1856kB min:512kB low:640kB high:768kB active:49528kB inactive:1120732kB present:1179136kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 3*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3588kB
Normal: 0*4kB 11*8kB 1*16kB 2*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3752kB
HighMem: 204*4kB 36*8kB 9*16kB 3*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1856kB
Swap cache: add 2582, delete 2011, find 276/524, race 0+0
Free swap = 3610572kB
Total swap = 3615236kB
Out of Memory: Killed process 17905 (terad).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1

Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16

Free pages: 21804kB (14464kB HighMem)
Active:9243 inactive:280452 dirty:214 writeback:0 unstable:0 free:5451 slab:220222 mapped:9110 pagetables:115
DMA free:3588kB min:68kB low:84kB high:100kB active:28kB inactive:708kB present:16384kB pages_scanned:5739 all_unreclaimable? yes
lowmem_reserve[]: 0 880 2031
Normal free:3752kB min:3756kB low:4692kB high:5632kB active:0kB inactive:368kB present:901120kB pages_scanned:6943 all_unreclaimable? yes
lowmem_reserve[]: 0 0 9212
HighMem free:14464kB min:512kB low:640kB high:768kB active:36944kB inactive:1120732kB present:1179136kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 3*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3588kB
Normal: 0*4kB 11*8kB 1*16kB 2*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3752kB
HighMem: 1824*4kB 572*8kB 122*16kB 4*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 14464kB
Swap cache: add 2582, delete 2047, find 276/524, race 0+0
Free swap = 3612564kB
Total swap = 3615236kB
Out of Memory: Killed process 19442 (terad).

SysRq : HELP : loglevel0-8 reBoot tErm kIll saK showMem Nice powerOff showPc unRaw Sync showTasks Unmount
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16

Free pages: 21380kB (14016kB HighMem)
Active:9400 inactive:280265 dirty:35 writeback:0 unstable:0 free:5345 slab:220381 mapped:9238 pagetables:139
DMA free:3604kB min:68kB low:84kB high:100kB active:80kB inactive:48kB present:16384kB pages_scanned:5968 all_unreclaimable? yes
lowmem_reserve[]: 0 880 2031
Normal free:3760kB min:3756kB low:4692kB high:5632kB active:60kB inactive:276kB present:901120kB pages_scanned:2019134 all_unreclaimable? yes
lowmem_reserve[]: 0 0 9212
HighMem free:14016kB min:512kB low:640kB high:768kB active:37460kB inactive:1120736kB present:1179136kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 5*4kB 2*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3604kB
Normal: 0*4kB 12*8kB 1*16kB 2*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3760kB
HighMem: 1712*4kB 572*8kB 122*16kB 4*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 14016kB
Swap cache: add 2582, delete 2047, find 276/524, race 0+0
Free swap = 3612564kB
Total swap = 3615236kB
Free swap: 3612564kB
524160 pages of RAM
294768 pages of HIGHMEM
5496 reserved pages
22159 pages shared
535 pages swap cached

--
Noah Meyerhans System Administrator
MIT Computer Science and Artificial Intelligence Laboratory


2005-03-15 21:59:02

by Sean

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

On Tue, March 15, 2005 3:44 pm, Noah Meyerhans said:
> Hello. We have a server, currently running 2.6.11-rc4, that is
> experiencing similar OOM problems to those described at
> http://groups-beta.google.com/group/fa.linux.kernel/msg/9633559fea029f6e
> and discussed further by several developers here (the summary is at
> http://www.kerneltraffic.org/kernel-traffic/kt20050212_296.html#6) We
> are running 2.6.11-rc4 because it contains the patches that Andrea
> mentioned in the kerneltraffic link. The problem was present in 2.6.10
> as well. We can try newer 2.6 kernels if it helps.
>
> The machine in question is a dual Xeon system with 2 GB of RAM, 3.5 GB
> of swap, and several TB of NFS exported filesystems. One notable point
> is that this machine has been running in overcommit mode 2
> (/proc/sys/vm/overcommit_memory = 2) and the OOM killer is still being
> triggered, which is allegedly not supposed to be possible according to
> the kerneltraffic.org document above. We had been running in overcommit
> mode 0 until about a month ago, and experienced similar OOM problems
> then as well.

We're seeing this on our dual Xeon box too, with 4 GB of RAM and 2GB of
swap (no NFS) using stock RHEL 4 kernel. The only thing that seems to
keep it from happening is setting /proc/sys/vm/vfs_cache_pressure to
10000.

Sean


2005-03-15 22:15:46

by Lee Revell

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

On Tue, 2005-03-15 at 16:56 -0500, Sean wrote:
> On Tue, March 15, 2005 3:44 pm, Noah Meyerhans said:
> > The machine in question is a dual Xeon system with 2 GB of RAM, 3.5 GB
> > of swap, and several TB of NFS exported filesystems. One notable point
> > is that this machine has been running in overcommit mode 2
> > (/proc/sys/vm/overcommit_memory = 2) and the OOM killer is still being
> > triggered, which is allegedly not supposed to be possible according to
> > the kerneltraffic.org document above. We had been running in overcommit
> > mode 0 until about a month ago, and experienced similar OOM problems
> > then as well.
>
> We're seeing this on our dual Xeon box too, with 4 GB of RAM and 2GB of
> swap (no NFS) using stock RHEL 4 kernel. The only thing that seems to
> keep it from happening is setting /proc/sys/vm/vfs_cache_pressure to
> 10000.

I suspect I hit this too on a smaller (UP) machine with 512MB RAM/512MB
swap while stress testing RT stuff with dbench and massively parallel
makes. The OOM seemed to trigger way before the machine filled up swap.
I dismissed it at the time, but maybe there's something there.

Lee

2005-03-15 23:53:06

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

Noah Meyerhans <[email protected]> wrote:
>
> Active:12382 inactive:280459 dirty:214 writeback:0 unstable:0 free:2299 slab:220221 mapped:12256 pagetables:122

Vast amounts of slab - presumably inode and dentries.

What sort of local filesystems are in use?

Can you take a copy of /proc/slabinfo when the backup has run for a while,
and send it?

It's useful to run `watch -n1 cat /proc/meminfo' to see what the various
caches are doing during the operation.

Also, run slabtop if you have it. Or bloatmeter
(http://www.zip.com.au/~akpm/linux/patches/stuff/bloatmon and
http://www.zip.com.au/~akpm/linux/patches/stuff/bloatmeter). The thing to
watch for here is the internal fragmentation of the slab caches:

dentry_cache: 76505KB 82373KB 92.87

93% is good. Sometimes it gets much worse - very regular directory
patterns can trigger high fragmentation levels.
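
If you don't have either of those handy, a quick-and-dirty equivalent is
easy enough - the sketch below is untested and assumes the 2.6
/proc/slabinfo layout of "name active_objs num_objs objsize ...":

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[512], name[64];
	unsigned long active, num, objsize;
	FILE *f = fopen("/proc/slabinfo", "r");

	if (!f) {
		perror("/proc/slabinfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		/* skip the version line and the "# name ..." header */
		if (line[0] == '#' || !strncmp(line, "slabinfo", 8))
			continue;
		if (sscanf(line, "%63s %lu %lu %lu",
			   name, &active, &num, &objsize) != 4 || num == 0)
			continue;
		/* used-KB total-KB percent-in-use, bloatmeter-style */
		printf("%-20s %8luKB %8luKB %6.2f\n", name,
		       active * objsize / 1024, num * objsize / 1024,
		       100.0 * active / num);
	}
	fclose(f);
	return 0;
}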

Does increasing /proc/sys/vm/vfs_cache_pressure help? If you're watching
/proc/meminfo you should be able to observe the effect of that upon the
Slab: figure.

2005-03-16 00:31:44

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

On Tue, Mar 15, 2005 at 03:44:13PM -0500, Noah Meyerhans wrote:
> Hello. We have a server, currently running 2.6.11-rc4, that is
> experiencing similar OOM problems to those described at
> http://groups-beta.google.com/group/fa.linux.kernel/msg/9633559fea029f6e
> and discussed further by several developers here (the summary is at
> http://www.kerneltraffic.org/kernel-traffic/kt20050212_296.html#6) We
> are running 2.6.11-rc4 because it contains the patches that Andrea
> mentioned in the kerneltraffic link. The problem was present in 2.6.10
> as well. We can try newer 2.6 kernels if it helps.

Thanks for testing the new code, but unfortunately the problem you're
facing is a different one. It's still definitely another VM bug though.

While looking into your bug I identified for sure a bug in how the VM
sets all_unreclaimable: the VM is setting all_unreclaimable on the
normal zone without any care about the progress we're making at freeing
the slab. Once all_unreclaimable is set, it's pretty much too late to
avoid going OOM. all_unreclaimable truly means OOM, so we must be
extremely careful when we set it (for sure the slab progress must be
taken into account).

We also want kswapd to help us in freeing the slab in the background
instead of erroneously giving it up if some slab cache is still
freeable.

Once all_unreclaimable is set, then shrink_caches will stop calling
shrink_zone for anything but the lowest prio, and this will lead to
sc.nr_scanned to be small, and this will lead to shrink_slab to get a
small parameter too.


In short I think we can start by trying this fix (which has some risk,
since now it might become harder to detect an oom condition, but I don't
see many other ways to take the slab progress into account without major
changes). Perhaps another way would be to check for
total_reclaimed < SWAP_CLUSTER_MAX, but the one I used in the patch is
much safer for your purposes (even if less safe in terms of not running
into livelocks).

Beware, this is absolutely untested and it may not be enough. Perhaps there
are more bugs in the same area (shrink_slab itself seems needlessly
complicated, and the different methods return inconsistent values:
dcache returns a percentage of the free entries, while dquot also
returns the allocated in-use entries, which makes the whole API
look unreliable).

Signed-off-by: Andrea Arcangeli <[email protected]>

--- x/mm/vmscan.c.~1~ 2005-03-14 05:02:17.000000000 +0100
+++ x/mm/vmscan.c 2005-03-16 01:28:16.000000000 +0100
@@ -1074,8 +1074,9 @@ scan:
total_scanned += sc.nr_scanned;
if (zone->all_unreclaimable)
continue;
- if (zone->pages_scanned >= (zone->nr_active +
- zone->nr_inactive) * 4)
+ if (!reclaim_state->reclaimed_slab &&
+ zone->pages_scanned >= (zone->nr_active +
+ zone->nr_inactive) * 4)
zone->all_unreclaimable = 1;
/*
* If we've done a decent amount of scanning and

This below is an untested attempt at bringing dquot a bit more in line
with the API, to make the whole thing a bit more consistent, though I
doubt you're using quotas, so it's only the above one that's going to be
interesting for you to test.

Signed-off-by: Andrea Arcangeli <[email protected]>

--- x/fs/dquot.c.~1~ 2005-03-08 01:02:13.000000000 +0100
+++ x/fs/dquot.c 2005-03-16 01:18:19.000000000 +0100
@@ -510,7 +510,7 @@ static int shrink_dqcache_memory(int nr,
spin_lock(&dq_list_lock);
if (nr)
prune_dqcache(nr);
- ret = dqstats.allocated_dquots;
+ ret = (dqstats.free_dquots / 100) * sysctl_vfs_cache_pressure;
spin_unlock(&dq_list_lock);
return ret;
}

Let us know if this helps in any way or not. Thanks!

2005-03-16 11:12:51

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

On Wed, Mar 16, 2005 at 01:31:34AM +0100, Andrea Arcangeli wrote:
> In short I think we can start by trying this fix (which has some risk,
> since now it might become harder to detect an oom condition, but I don't

Some testing shows that oom conditions are still detected fine (I
expected this but I wasn't completely sure until I tested it ;). Now the
main question is if this is enough to fix your problem or if there are
more hidden bugs in the same area.

2005-03-16 12:05:14

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

Andrea Arcangeli <[email protected]> wrote:
>
> the VM is setting all_unreclaimable on the
> normal zone without any care about the progress we're making at freeing
> the slab.

Urgh, I didn't notice that all_unreclaimable is set.

> Beware this absolutely untested and it may not be enough. Perhaps there
> are more bugs in the same area (the shrink_slab itself seems overkill
> complicated for no good reason and different methods returns random
> stuff, dcache returns a percentage of the free entries, dquot instead
> returns the allocated inuse entries too which makes the whole API
> looking unreliable).

No, the two functions are equivalent for the default value of
vfs_cache_pressure (100) - it's not a percentage. It's just that we forgot
about the quota cache when adding the tunable. And mbcache, come to that.
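
To spell out the arithmetic (standalone toy below, not kernel code): the
(count / 100) * sysctl_vfs_cache_pressure scaling is a no-op at the
default of 100, and larger values just inflate the number of objects the
shrinker reports as reclaimable - which is why setting the sysctl to
10000 makes the VM lean so much harder on these caches:

#include <stdio.h>

/* the scaling the dcache shrinker applies; the dquot and mbcache fixes
 * in this thread apply the same formula */
static unsigned long scaled(unsigned long count, unsigned long pressure)
{
	return (count / 100) * pressure;
}

int main(void)
{
	unsigned long free_entries = 500000;	/* arbitrary example count */

	printf("pressure   100: %lu\n", scaled(free_entries, 100));
	printf("pressure 10000: %lu\n", scaled(free_entries, 10000));
	return 0;
}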

> Signed-off-by: Andrea Arcangeli <[email protected]>
>
> --- x/mm/vmscan.c.~1~ 2005-03-14 05:02:17.000000000 +0100
> +++ x/mm/vmscan.c 2005-03-16 01:28:16.000000000 +0100
> @@ -1074,8 +1074,9 @@ scan:
> total_scanned += sc.nr_scanned;
> if (zone->all_unreclaimable)
> continue;
> - if (zone->pages_scanned >= (zone->nr_active +
> - zone->nr_inactive) * 4)

A change we made a while back effectively doubles the rate at which
pages_scanned gets incremented here (we now account for the active list as
well as the inactive list). So this should be *8 to make it more
equivalent to the old code. Not that this is likely to make much
difference.


> + if (!reclaim_state->reclaimed_slab &&
> + zone->pages_scanned >= (zone->nr_active +
> + zone->nr_inactive) * 4)
> zone->all_unreclaimable = 1;

That might not change anything because we clear ->all_unreclaimable in
free_page_bulk(). Although that is behind the per-cpu-pages, so there will
be some lag. And this change will cause us to not bale out of reclaim..

Still, I think it would make more sense to return a success indication from
shrink_slab() if we actually freed any slab objects. That will prevent us
from incorrectly going all_unreclaimable if all we happen to be doing is
increasing slab internal fragmentation.

We could do that kludgily by re-polling the shrinker but it would be better
to return a second value from all the shrinkers.

> --- x/fs/dquot.c.~1~ 2005-03-08 01:02:13.000000000 +0100
> +++ x/fs/dquot.c 2005-03-16 01:18:19.000000000 +0100
> @@ -510,7 +510,7 @@ static int shrink_dqcache_memory(int nr,
> spin_lock(&dq_list_lock);
> if (nr)
> prune_dqcache(nr);
> - ret = dqstats.allocated_dquots;
> + ret = (dqstats.free_dquots / 100) * sysctl_vfs_cache_pressure;
> spin_unlock(&dq_list_lock);
> return ret;
> }

yup.

2005-03-16 12:17:55

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

Andrew Morton <[email protected]> wrote:
>
> Still, I think it would make more sense to return a success indication from
> shrink_slab() if we actually freed any slab objects. That will prevent us
> from incorrectly going all_unreclaimable if all we happen to be doing is
> increasing slab internal fragmentation.
>
> We could do that kludgily by re-polling the shrinker but it would be better
> to return a second value from all the shrinkers.

This is the kludgy version.

--- 25/mm/vmscan.c~vmscan-notice-slab-shrinking 2005-03-16 04:12:49.000000000 -0800
+++ 25-akpm/mm/vmscan.c 2005-03-16 04:14:02.000000000 -0800
@@ -180,17 +180,20 @@ EXPORT_SYMBOL(remove_shrinker);
* `lru_pages' represents the number of on-LRU pages in all the zones which
* are eligible for the caller's allocation attempt. It is used for balancing
* slab reclaim versus page reclaim.
+ *
+ * Returns the number of slab objects which we shrunk.
*/
static int shrink_slab(unsigned long scanned, unsigned int gfp_mask,
unsigned long lru_pages)
{
struct shrinker *shrinker;
+ int ret = 0;

if (scanned == 0)
scanned = SWAP_CLUSTER_MAX;

if (!down_read_trylock(&shrinker_rwsem))
- return 0;
+ return 1; /* Assume we'll be able to shrink next time */

list_for_each_entry(shrinker, &shrinker_list, list) {
unsigned long long delta;
@@ -209,10 +212,14 @@ static int shrink_slab(unsigned long sca
while (total_scan >= SHRINK_BATCH) {
long this_scan = SHRINK_BATCH;
int shrink_ret;
+ int nr_before;

+ nr_before = (*shrinker->shrinker)(0, gfp_mask);
shrink_ret = (*shrinker->shrinker)(this_scan, gfp_mask);
if (shrink_ret == -1)
break;
+ if (shrink_ret < nr_before)
+ ret += nr_before - shrink_ret;
mod_page_state(slabs_scanned, this_scan);
total_scan -= this_scan;

@@ -222,7 +229,7 @@ static int shrink_slab(unsigned long sca
shrinker->nr += total_scan;
}
up_read(&shrinker_rwsem);
- return 0;
+ return ret;
}

/* Called without lock on whether page is mapped, so answer is unstable */
@@ -1077,6 +1084,7 @@ scan:
*/
for (i = 0; i <= end_zone; i++) {
struct zone *zone = pgdat->node_zones + i;
+ int nr_slab;

if (zone->present_pages == 0)
continue;
@@ -1098,14 +1106,15 @@ scan:
sc.swap_cluster_max = nr_pages? nr_pages : SWAP_CLUSTER_MAX;
shrink_zone(zone, &sc);
reclaim_state->reclaimed_slab = 0;
- shrink_slab(sc.nr_scanned, GFP_KERNEL, lru_pages);
+ nr_slab = shrink_slab(sc.nr_scanned, GFP_KERNEL,
+ lru_pages);
sc.nr_reclaimed += reclaim_state->reclaimed_slab;
total_reclaimed += sc.nr_reclaimed;
total_scanned += sc.nr_scanned;
if (zone->all_unreclaimable)
continue;
- if (zone->pages_scanned >= (zone->nr_active +
- zone->nr_inactive) * 4)
+ if (nr_slab == 0 && zone->pages_scanned >=
+ (zone->nr_active + zone->nr_inactive) * 4)
zone->all_unreclaimable = 1;
/*
* If we've done a decent amount of scanning and
_

2005-03-16 12:23:33

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

Andrea Arcangeli <[email protected]> wrote:
>
> - ret = dqstats.allocated_dquots;
> + ret = (dqstats.free_dquots / 100) * sysctl_vfs_cache_pressure;

Oh I see. Yes, using .allocated_dquots was wrong.

2005-03-16 12:30:43

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

Andrea Arcangeli <[email protected]> wrote:
>
> This below is an untested attempt at bringing dquot a bit more in line
> with the API, to make the whole thing a bit more consistent,

Like this? (Noah, don't bother testing this one)



Fix some bugs spotted by Andrea Arcangeli <[email protected]>

- When we added /proc/sys/vm/vfs_cache_pressure we forgot to allow it to
tune the dquot and mbcache slabs as well.

- Reduce lock contention in shrink_dqcache_memory().

- Use dqstats.free_dquots in shrink_dqcache_memory(): this is the count of
reclaimable objects.

Signed-off-by: Andrew Morton <[email protected]>
---

25-akpm/fs/dquot.c | 12 +++++-------
25-akpm/fs/mbcache.c | 2 +-
2 files changed, 6 insertions(+), 8 deletions(-)

diff -puN fs/dquot.c~slab-shrinkers-use-vfs_cache_pressure fs/dquot.c
--- 25/fs/dquot.c~slab-shrinkers-use-vfs_cache_pressure 2005-03-16 04:22:01.000000000 -0800
+++ 25-akpm/fs/dquot.c 2005-03-16 04:27:09.000000000 -0800
@@ -505,14 +505,12 @@ static void prune_dqcache(int count)

static int shrink_dqcache_memory(int nr, unsigned int gfp_mask)
{
- int ret;
-
- spin_lock(&dq_list_lock);
- if (nr)
+ if (nr) {
+ spin_lock(&dq_list_lock);
prune_dqcache(nr);
- ret = dqstats.allocated_dquots;
- spin_unlock(&dq_list_lock);
- return ret;
+ spin_unlock(&dq_list_lock);
+ }
+ return (dqstats.free_dquots / 100) * sysctl_vfs_cache_pressure;
}

/*
diff -puN fs/mbcache.c~slab-shrinkers-use-vfs_cache_pressure fs/mbcache.c
--- 25/fs/mbcache.c~slab-shrinkers-use-vfs_cache_pressure 2005-03-16 04:22:01.000000000 -0800
+++ 25-akpm/fs/mbcache.c 2005-03-16 04:24:43.000000000 -0800
@@ -225,7 +225,7 @@ mb_cache_shrink_fn(int nr_to_scan, unsig
e_lru_list), gfp_mask);
}
out:
- return count;
+ return (count / 100) * sysctl_vfs_cache_pressure;
}


_

2005-03-16 18:39:38

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

On Wed, Mar 16, 2005 at 04:04:35AM -0800, Andrew Morton wrote:
> > + if (!reclaim_state->reclaimed_slab &&
> > + zone->pages_scanned >= (zone->nr_active +
> > + zone->nr_inactive) * 4)
> > zone->all_unreclaimable = 1;
>
> That might not change anything because we clear ->all_unreclaimable in
> free_page_bulk(). [..]

Really? free_page_bulk is called inside shrink_slab, and so it's overwritten
later by all_unreclaimable. Otherwise how could all_unreclaimable be set
in the first place if a single page freed by shrink_slab would be enough
to clear it?

  shrink_slab
    all_unreclaimable = 0
  zone->pages_scanned >= (zone->nr_active [..]
    all_unreclaimable = 1

  try_to_free_pages
    all_unreclaimable == 1
      oom

I also considered changing shrink_slab to return a progress retval, but
then I noticed I could get away with a one-liner fix ;).

Your fix is better but it should be mostly equivalent in practice. I
liked that the down_read_trylock failure path doesn't risk going oom; the
one-liner couldn't handle that ;).

thanks!

2005-03-16 22:22:09

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

Andrea Arcangeli <[email protected]> wrote:
>
> On Wed, Mar 16, 2005 at 04:04:35AM -0800, Andrew Morton wrote:
> > > + if (!reclaim_state->reclaimed_slab &&
> > > + zone->pages_scanned >= (zone->nr_active +
> > > + zone->nr_inactive) * 4)
> > > zone->all_unreclaimable = 1;
> >
> > That might not change anything because we clear ->all_unreclaimable in
> > free_page_bulk(). [..]
>
> Really? free_page_bulk is called inside shrink_slab, and so it's overwritten
> later by all_unreclaimable. Otherwise how could all_unreclaimable be set
> in the first place if a single page freed by shrink_slab would be enough
> to clear it?
>
> shrink_slab
> all_unreclaimable = 0
> zone->pages_scanned >= (zone->nr_active [..]
> all_unreclaimable = 1
>
> try_to_free_pages
> all_unreclaimable == 1
> oom

Spose so.

> I also considering changing shrink_slab to return a progress retval, but
> then I noticed I could get away with a one liner fix ;).
>
> Your fix is better but it should be mostly equivalent in pratcie. I
> liked the dontrylock not risking to go oom, the one liner couldn't
> handle that ;).

It has a problem. If ZONE_DMA is really, really oom, kswapd will sit there
freeing up ZONE_NORMAL slab objects and not setting all_unreclaimable.
We'll end up using tons of CPU and reclaiming lots of slab in response to a
ZONE_DMA oom.

I'm thinking that the most accurate way of fixing this and also avoiding
the "we're fragmenting slab but not actually freeing pages yet" problem is


- change task_struct->reclaim_state so that it has an array of booleans
(one per zone)

- in kmem_cache_free, work out what zone the object corresponds to and
set the boolean in current->reclaim_state which corresponds to that zone.

- in balance_pgdat(), inspect this zone's boolean to see if we're making
any forward progress with slab freeing.

Probably we can do the work in kmem_cache_free() at the place where we
spill the slab magazine, to optimise things a bit. I haven't looked at it.

But that has a problem too. Some other task might be freeing objects into
the relevant zone instead of this one.

So maybe a better approach would be to add a "someone freed something"
counter to the zone structure. That would be incremented whenever anyone
frees a page for a slab object. Then in balance_pgdat we take a look at
that before and after performing the LRU and slab scans. If it
incremented, don't set all_unreclaimable. And still keep the
free_pages_bulk code there as the code which takes us _out_ of the
all_unreclaimable state.

It's tricky.


2005-03-18 16:18:40

by Noah Meyerhans

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

Hi Andrew, Andrea, et al. Sorry for taking a while to get back to you
on this. Thanks a lot for the work you've already put in to this. We
built a 2.6.11.4 kernel with Andrea's first patch for this problem (the
patch is included at the end of this mail, just to make sure you know
which one I'm referring to). We had also switched back to overcommit
mode 0. More comments follow inline...

On Tue, Mar 15, 2005 at 03:46:08PM -0800, Andrew Morton wrote:
> > Active:12382 inactive:280459 dirty:214 writeback:0 unstable:0 free:2299 slab:220221 mapped:12256 pagetables:122
>
> Vast amounts of slab - presumably inode and dentries.
>
> What sort of local filesystems are in use?

Well, that's certainly an interesting question. The filesystem is IBM's
JFS. If you tell me that's part of the problem, I'm not likely to
disagree. 8^)

> Can you take a copy of /proc/slabinfo when the backup has run for a while,
> send it?

We triggered a backup process, and I watched slabtop and /proc/meminfo
while it was running, right up until the time the OOM killer was
triggered. Unfortunately I didn't get a copy of slabinfo. Hopefully
the slabtop and meminfo output help a bit, though. Here are the last
three samples of /proc/meminfo:

Fri Mar 18 10:41:08 EST 2005
MemTotal: 2074660 kB
MemFree: 8492 kB
Buffers: 19552 kB
Cached: 1132916 kB
SwapCached: 3672 kB
Active: 55040 kB
Inactive: 1136024 kB
HighTotal: 1179072 kB
HighFree: 576 kB
LowTotal: 895588 kB
LowFree: 7916 kB
SwapTotal: 3615236 kB
SwapFree: 3609168 kB
Dirty: 68 kB
Writeback: 0 kB
Mapped: 43744 kB
Slab: 861952 kB
CommitLimit: 4652564 kB
Committed_AS: 53272 kB
PageTables: 572 kB
VmallocTotal: 114680 kB
VmallocUsed: 6700 kB
VmallocChunk: 107964 kB
Fri Mar 18 10:41:10 EST 2005
MemTotal: 2074660 kB
MemFree: 8236 kB
Buffers: 19512 kB
Cached: 1132884 kB
SwapCached: 3672 kB
Active: 54708 kB
Inactive: 1136288 kB
HighTotal: 1179072 kB
HighFree: 576 kB
LowTotal: 895588 kB
LowFree: 7660 kB
SwapTotal: 3615236 kB
SwapFree: 3609168 kB
Dirty: 68 kB
Writeback: 0 kB
Mapped: 43744 kB
Slab: 862216 kB
CommitLimit: 4652564 kB
Committed_AS: 53272 kB
PageTables: 572 kB
VmallocTotal: 114680 kB
VmallocUsed: 6700 kB
VmallocChunk: 107964 kB
<date wasn't inserted here because OOM killer killed it>
MemTotal: 2074660 kB
MemFree: 8620 kB
Buffers: 19388 kB
Cached: 1132552 kB
SwapCached: 3780 kB
Active: 56200 kB
Inactive: 1134388 kB
HighTotal: 1179072 kB
HighFree: 960 kB
LowTotal: 895588 kB
LowFree: 7660 kB
SwapTotal: 3615236 kB
SwapFree: 3609204 kB
Dirty: 104 kB
Writeback: 0 kB
Mapped: 43572 kB
Slab: 862484 kB
CommitLimit: 4652564 kB
Committed_AS: 53100 kB
PageTables: 564 kB
VmallocTotal: 114680 kB
VmallocUsed: 6700 kB
VmallocChunk: 107964 kB

Here are the top few entries from the last page of slabtop (columns:
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME):
830696 830639 99% 0.80K 207674 4 830696K jfs_ip
129675 4841 3% 0.05K 1729 75 6916K buffer_head
39186 35588 90% 0.27K 2799 14 11196K radix_tree_node
5983 2619 43% 0.12K 193 31 772K size-128
4860 4728 97% 0.05K 60 81 240K journal_head
4403 4403 100% 0.03K 37 119 148K size-32
4164 4161 99% 1.00K 1041 4 4164K size-1024
3857 1552 40% 0.13K 133 29 532K dentry_cache
3355 1781 53% 0.06K 55 61 220K size-64
3103 3026 97% 0.04K 29 107 116K sysfs_dir_cache
2712 2412 88% 0.02K 12 226 48K dm_io
2712 2412 88% 0.02K 12 226 48K dm_tio
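
In other words, the jfs_ip cache alone accounts for roughly 830 MB of the
~862 MB reported in Slab: above, and it is 99% in use - so unlike the
dentry_cache example you gave, this doesn't look like internal
fragmentation; essentially all of lowmem is sitting in the JFS inode
cache.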



> Does increasing /proc/sys/vm/vfs_cache_pressure help? If you're watching
> /proc/meminfo you should be able to observe the effect of that upon the
> Slab: figure.

It doesn't have any noticeable effect on the stability of the machine. I
set it to 10000 but within a few hours the machine had crashed again.

We weren't able to capture all of the console messages prior to the
crash. Here are some of them. Note that, again, the last memory dump
was manually triggered via SysRq:

nactive:132kB present:16384kB pages_scanned:1589 all_unreclaimable? yes
lowmem_reserve[]: 0 880 2031
Normal free:3752kB min:3756kB low:4692kB high:5632kB active:9948kB inactive:9648kB present:901120kB pages_scanned:20640 all_unreclaimable? yes
lowmem_reserve[]: 0 0 9212
HighMem free:960kB min:512kB low:640kB high:768kB active:45132kB inactive:1125920kB present:1179136kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB0*4096kB = 3588kB
Normal: 0*4kB 1*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3752kB
HighMem: 92*4kB 10*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 960kB
Swap cache: add 2443, delete 1525, find 1049/1167, race 0+0
Free swap = 3609204kB
Total swap = 3615236kB
Out of Memory: Killed process 21640 (zsh).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu:

cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16

Free pages: 8812kB (1088kB HighMem)
Active:13647 inactive:283917 dirty:20 writeback:0 unstable:0 free:2203 slab:215622 mapped:10903 pagetables:141
DMA free:3588kB min:68kB low:84kB high:100kB active:0kB inactive:132kB present:16384kB pages_scanned:3084 all_unreclaimable? yes
lowmem_reserve[]: 0 880 2031
Normal free:4136kB min:3756kB low:4692kB high:5632kB active:9504kB inactive:9656kB present:901120kB pages_scanned:22772 all_unreclaimable? yes
lowmem_reserve[]: 0 0 9212
HighMem free:1088kB min:512kB low:640kB high:768kB active:45084kB inactive:1125880kB present:1179136kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB0*4096kB = 3588kB
Normal: 96*4kB 1*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4136kB
HighMem: 116*4kB 14*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1088kB
Swap cache: add 2470, delete 1525, find 1085/1207, race 0+0
Free swap = 3609204kB
Total swap = 3615236kB
Out of Memory: Killed process 21642 (sleep).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1

Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16

Free pages: 8940kB (1216kB HighMem)
Active:13638 inactive:283923 dirty:27 writeback:0 unstable:0 free:2235 slab:215622 mapped:10880 pagetables:141
DMA free:3588kB min:68kB low:84kB high:100kB active:0kB inactive:132kB present:16384kB pages_scanned:3149 all_unreclaimable? yes
lowmem_reserve[]: 0 880 2031
Normal free:4136kB min:3756kB low:4692kB high:5632kB active:9520kB inactive:9668kB present:901120kB pages_scanned:22837 all_unreclaimable? yes
lowmem_reserve[]: 0 0 9212
HighMem free:1216kB min:512kB low:640kB high:768kB active:45032kB inactive:1125892kB present:1179136kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB0*4096kB = 3588kB
Normal: 96*4kB 1*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4136kB
HighMem: 146*4kB 15*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1216kB
Swap cache: add 2470, delete 1525, find 1095/1217, race 0+0
Free swap = 3609204kB
Total swap = 3615236kB




Andrea's patch, applied to the kernel we're currently running:
--- x/mm/vmscan.c.~1~ 2005-03-14 05:02:17.000000000 +0100
+++ x/mm/vmscan.c 2005-03-16 01:28:16.000000000 +0100
@@ -1074,8 +1074,9 @@ scan:
total_scanned += sc.nr_scanned;
if (zone->all_unreclaimable)
continue;
- if (zone->pages_scanned >= (zone->nr_active +
- zone->nr_inactive) * 4)
+ if (!reclaim_state->reclaimed_slab &&
+ zone->pages_scanned >= (zone->nr_active +
+ zone->nr_inactive) * 4)
zone->all_unreclaimable = 1;
/*
* If we've done a decent amount of scanning and

--
Noah Meyerhans System Administrator
MIT Computer Science and Artificial Intelligence Laboratory

2005-03-23 19:55:05

by Mingming Cao

[permalink] [raw]
Subject: OOM problems on 2.6.12-rc1 with many fsx tests

Andrea, Andrew,

I ran into the OOM problem again on 2.6.12-rc1. I ran some (20) fsx tests
on the 2.6.12-rc1 kernel (and 2.6.11-mm4) on an ext3 filesystem; after
about 10 hours the system hit OOM, and the OOM killer kept killing
processes one by one. I could reproduce this problem very consistently on
a 2-way PIII 700MHz machine with 512MB RAM. The problem could also be
reproduced by running the same test on reiserfs.

The fsx command is:

./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &

I also saw the fsx tests start generating reports about reading bad data
after the tests had run for about 9 hours (one hour before the OOM
happened).

I also logged /proc/meminfo every 30 seconds from a remote machine.
I will attach the OOM messages, plus /proc/meminfo at the beginning of
the test and at the end (when OOM killed the logging process), here.

When I ran the tests I didn't know whether this was a known issue, and I
don't know if 2.6.12-rc1 contains the proposed fix from Andrea/Andrew or
not. If you have something that I could try, let me know.

Thanks!

Mingming


============================================================
OOM messages on console
============================================================

Mar 23 02:16:16 elm3b92 kernel: oom-killer: gfp_mask=0x80d2
Mar 23 02:16:18 elm3b92 kernel: DMA per-cpu:
Mar 23 02:16:18 elm3b92 kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 23 02:16:18 elm3b92 kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 23 02:16:18 elm3b92 kernel: cpu 1 hot: low 2, high 6, batch 1
Mar 23 02:16:18 elm3b92 kernel: cpu 1 cold: low 0, high 2, batch 1
Mar 23 02:16:18 elm3b92 kernel: Normal per-cpu:
Mar 23 02:16:18 elm3b92 kernel: cpu 0 hot: low 32, high 96, batch 16
Mar 23 02:16:18 elm3b92 kernel: cpu 0 cold: low 0, high 32, batch 16
Mar 23 02:16:18 elm3b92 kernel: cpu 1 hot: low 32, high 96, batch 16
Mar 23 02:16:18 elm3b92 kernel: cpu 1 cold: low 0, high 32, batch 16
Mar 23 02:16:18 elm3b92 kernel: HighMem per-cpu: empty
Mar 23 02:16:19 elm3b92 kernel:
Mar 23 02:16:20 elm3b92 kernel: Free pages: 5208kB (0kB HighMem)
Mar 23 02:16:20 elm3b92 kernel: Active:60624 inactive:60838 dirty:0
writeback:0 unstable:0 free:1302 slab:3208 mapped:196 pagetables:105
Mar 23 02:16:20 elm3b92 kernel: DMA free:2072kB min:88kB low:108kB
high:132kB active:4776kB inactive:3908kB present:16384kB
pages_scanned:15150 all_unreclaimable? yes
Mar 23 02:16:20 elm3b92 kernel: lowmem_reserve[]: 0 496 496
Mar 23 02:16:20 elm3b92 kernel: Normal free:3136kB min:2804kB low:3504kB
high:4204kB active:237968kB inactive:239200kB present:507904kB
pages_scanned:15151 all_unreclaimable? no
Mar 23 02:16:20 elm3b92 kernel: lowmem_reserve[]: 0 0 0
Mar 23 02:16:20 elm3b92 kernel: HighMem free:0kB min:128kB low:160kB
high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
Mar 23 02:16:20 elm3b92 kernel: lowmem_reserve[]: 0 0 0
Mar 23 02:16:20 elm3b92 kernel: DMA: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2072kB
Mar 23 02:16:20 elm3b92 kernel: Normal: 58*4kB 25*8kB 1*16kB 2*32kB
0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3200kB
Mar 23 02:16:20 elm3b92 kernel: HighMem: empty
Mar 23 02:16:20 elm3b92 kernel: Swap cache: add 284169, delete 283991,
find 105376/203154, race 2+25
Mar 23 02:16:20 elm3b92 kernel: Free swap = 1042692kB
Mar 23 02:16:20 elm3b92 kernel: Total swap = 1052216kB
Mar 23 02:16:20 elm3b92 kernel: Out of Memory: Killed process 2115
(cupsd).
Mar 23 02:21:01 elm3b92 kernel: oom-killer: gfp_mask=0x1d2
Mar 23 02:21:03 elm3b92 kernel: DMA per-cpu:
Mar 23 02:21:03 elm3b92 kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 23 02:21:03 elm3b92 kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 23 02:21:03 elm3b92 kernel: cpu 1 hot: low 2, high 6, batch 1
Mar 23 02:21:03 elm3b92 kernel: cpu 1 cold: low 0, high 2, batch 1
Mar 23 02:21:03 elm3b92 kernel: Normal per-cpu:
Mar 23 02:21:03 elm3b92 kernel: cpu 0 hot: low 32, high 96, batch 16
Mar 23 02:21:03 elm3b92 kernel: cpu 0 cold: low 0, high 32, batch 16
Mar 23 02:21:03 elm3b92 kernel: cpu 1 hot: low 32, high 96, batch 16
Mar 23 02:21:03 elm3b92 kernel: cpu 1 cold: low 0, high 32, batch 16
Mar 23 02:21:03 elm3b92 kernel: HighMem per-cpu: empty
Mar 23 02:21:04 elm3b92 kernel:
Mar 23 02:21:08 elm3b92 kernel: Free pages: 4872kB (0kB HighMem)
Mar 23 02:21:10 elm3b92 kernel: Active:61061 inactive:60495 dirty:0
writeback:30 unstable:0 free:1218 slab:3159 mapped:272 pagetables:86
Mar 23 02:21:11 elm3b92 kernel: DMA free:2072kB min:88kB low:108kB
high:132kB active:4832kB inactive:3852kB present:16384kB
pages_scanned:18465 all_unreclaimable? yes
Mar 23 02:21:11 elm3b92 kernel: lowmem_reserve[]: 0 496 496
Mar 23 02:21:11 elm3b92 kernel: Normal free:2800kB min:2804kB low:3504kB
high:4204kB active:239492kB inactive:237988kB present:507904kB
pages_scanned:107346 all_unreclaimable? no
Mar 23 02:21:11 elm3b92 kernel: lowmem_reserve[]: 0 0 0
Mar 23 02:21:11 elm3b92 kernel: DMA: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2072kB
Mar 23 02:21:11 elm3b92 kernel: Normal: 2*4kB 7*8kB 3*16kB 2*32kB 0*64kB
1*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 2864kB
Mar 23 02:21:11 elm3b92 kernel: HighMem: empty
Mar 23 02:21:11 elm3b92 kernel: Swap cache: add 292112, delete 291918,
find 107914/208530, race 2+28
Mar 23 02:21:11 elm3b92 kernel: Free swap = 1044992kB
Mar 23 02:21:11 elm3b92 kernel: Total swap = 1052216kB
Mar 23 02:21:11 elm3b92 kernel: Out of Memory: Killed process 5638
(nscd).
Mar 23 02:21:19 elm3b92 kernel: oom-killer: gfp_mask=0x1d2
Mar 23 02:21:21 elm3b92 kernel: DMA per-cpu:
Mar 23 02:21:21 elm3b92 kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 23 02:21:21 elm3b92 kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 23 02:21:21 elm3b92 kernel: cpu 1 hot: low 2, high 6, batch 1
Mar 23 02:21:21 elm3b92 kernel: cpu 1 cold: low 0, high 2, batch 1
Mar 23 02:21:21 elm3b92 kernel: Normal per-cpu:
Mar 23 02:21:21 elm3b92 kernel: cpu 0 hot: low 32, high 96, batch 16
Mar 23 02:21:21 elm3b92 kernel: cpu 0 cold: low 0, high 32, batch 16
Mar 23 02:21:21 elm3b92 kernel: cpu 1 hot: low 32, high 96, batch 16
Mar 23 02:21:21 elm3b92 kernel: cpu 1 cold: low 0, high 32, batch 16
Mar 23 02:21:21 elm3b92 kernel: HighMem per-cpu: empty
Mar 23 02:21:21 elm3b92 kernel:
Mar 23 02:21:21 elm3b92 kernel: Free pages: 5296kB (0kB HighMem)
Mar 23 02:21:21 elm3b92 kernel: Active:60775 inactive:60649 dirty:28
writeback:0 unstable:0 free:1324 slab:3157 mapped:220 pagetables:81
Mar 23 02:21:21 elm3b92 kernel: DMA free:2072kB min:88kB low:108kB
high:132kB active:4864kB inactive:3820kB present:16384kB
pages_scanned:18855 all_unreclaimable? yes
Mar 23 02:21:21 elm3b92 kernel: lowmem_reserve[]: 0 496 496
Mar 23 02:21:21 elm3b92 kernel: Normal free:3224kB min:2804kB low:3504kB
high:4204kB active:238080kB inactive:238776kB present:507904kB
pages_scanned:52439 all_unreclaimable? no
Mar 23 02:21:21 elm3b92 kernel: lowmem_reserve[]: 0 0 0
Mar 23 02:21:21 elm3b92 kernel: HighMem free:0kB min:128kB low:160kB
high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
Mar 23 02:21:21 elm3b92 kernel: lowmem_reserve[]: 0 0 0
Mar 23 02:21:21 elm3b92 kernel: DMA: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2072kB
Mar 23 02:21:21 elm3b92 kernel: Normal: 96*4kB 19*8kB 0*16kB 2*32kB
0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3288kB
Mar 23 02:21:21 elm3b92 kernel: HighMem: empty
Mar 23 02:21:21 elm3b92 kernel: Swap cache: add 293172, delete 292990,
find 108239/209189, race 2+31
Mar 23 02:21:21 elm3b92 kernel: Free swap = 1045004kB
Mar 23 02:21:21 elm3b92 kernel: Total swap = 1052216kB
Mar 23 02:21:21 elm3b92 kernel: Out of Memory: Killed process 2062
(portmap).



=======================================================
/proc/meminfo at the beginning of the fsx tests
=======================================================
+ cat /proc/meminfo
MemTotal: 510400 kB
MemFree: 397284 kB
Buffers: 23788 kB
Cached: 35712 kB
SwapCached: 0 kB
Active: 68592 kB
Inactive: 21312 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 510400 kB
LowFree: 397284 kB
SwapTotal: 1052216 kB
SwapFree: 1052216 kB
Dirty: 2312 kB
Writeback: 0 kB
Mapped: 23500 kB
Slab: 16412 kB
CommitLimit: 1307416 kB
Committed_AS: 74804 kB
PageTables: 572 kB
VmallocTotal: 516024 kB
VmallocUsed: 3700 kB
VmallocChunk: 512320 kB
+ sleep 30

===========================================
/proc/meminfo at the end of the log file
===========================================
+ cat /proc/meminfo
MemTotal: 510400 kB
MemFree: 5512 kB
Buffers: 500 kB
Cached: 584 kB
SwapCached: 344 kB
Active: 243580 kB
Inactive: 242248 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 510400 kB
LowFree: 5512 kB
SwapTotal: 1052216 kB
SwapFree: 1045984 kB
Dirty: 0 kB
Writeback: 0 kB
Mapped: 804 kB
Slab: 12584 kB
CommitLimit: 1307416 kB
Committed_AS: 13560 kB
PageTables: 284 kB
VmallocTotal: 516024 kB
VmallocUsed: 3700 kB
VmallocChunk: 512320 kB
+ sleep 30




2005-03-23 22:50:25

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

Mingming Cao <[email protected]> wrote:
>
> I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> hours the system hit OOM, and OOM keep killing processes one by one.

I don't have a very good record reading these oom dumps lately, but this
one looks really weird. Basically no mapped memory, tons of pagecache on
the LRU.

It would be interesting if you could run the same test on 2.6.11.

2005-03-23 23:14:34

by Martin J. Bligh

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

>> I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
>> 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
>> hours the system hit OOM, and OOM keep killing processes one by one.
>
> I don't have a very good record reading these oom dumps lately, but this
> one look really weird. Basically no mapped memory, tons of pagecache on
> the LRU.
>
> It would be interesting if you could run the same test on 2.6.11.

One thing I'm finding is that it's hard to backtrace who has each page
in this sort of situation. My plan is to write a debug patch to walk
mem_map and dump out some info on each page. I would appreciate ideas
on what info would be useful here. Some things are fairly obvious, like
we want to know if it's anon / mapped into address space (& which),
whether it's slab / buffers / pagecache etc ... any other suggestions
you have would be much appreciated.

I'm suspecting in many cases we don't keep enough info, and it would be
too slow to keep it in the default case - so I may need to add some
extra debug fields in struct page as a config option, but let's start
with what we have.

M.

2005-03-23 23:21:04

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

"Martin J. Bligh" <[email protected]> wrote:
>
> > It would be interesting if you could run the same test on 2.6.11.
>
> One thing I'm finding is that it's hard to backtrace who has each page
> in this sort of situation. My plan is to write a debug patch to walk
> mem_map and dump out some info on each page. I would appreciate ideas
> on what info would be useful here. Some things are fairly obvious, like
> we want to know if it's anon / mapped into address space (& which),
> whether it's slab / buffers / pagecache etc ... any other suggestions
> you have would be much appreciated.

You could use

page-owner-tracking-leak-detector.patch
make-page_owner-handle-non-contiguous-page-ranges.patch
add-gfp_mask-to-page-owner.patch

which sticks an 8-slot stack backtrace into each page, recording who
allocated it.

But that's probably not very interesting info for pagecache pages.

Nothing beats poking around in a dead machine's guts with kgdb though.

2005-03-23 23:27:07

by Andries Brouwer

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

On Wed, Mar 23, 2005 at 03:20:55PM -0800, Andrew Morton wrote:

> Nothing beats poking around in a dead machine's guts with kgdb though.

Everyone to his taste.

But I was surprised by

> SwapTotal: 1052216 kB
> SwapFree: 1045984 kB

Strange that processes are killed while lots of swap is available.

Andries

2005-03-23 23:31:10

by Martin J. Bligh

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

>> Nothing beats poking around in a dead machine's guts with kgdb though.
>
> Everyone his taste.
>
> But I was surprised by
>
>> SwapTotal: 1052216 kB
>> SwapFree: 1045984 kB
>
> Strange that processes are killed while lots of swap is available.

I don't think we're that smart about it. If we're really low on mem, it
seems we invoke the OOM killer whether processes are causing the problem
or not.

OTOH, if we can't free the kernel mem, we don't have much choice, but
it's not really helping much ;-)

M.

2005-03-23 23:43:44

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

"Martin J. Bligh" <[email protected]> wrote:
>
> >> Nothing beats poking around in a dead machine's guts with kgdb though.
> >
> > Everyone his taste.
> >
> > But I was surprised by
> >
> >> SwapTotal: 1052216 kB
> >> SwapFree: 1045984 kB
> >
> > Strange that processes are killed while lots of swap is available.
>
> I don't think we're that smart about it. If we're really low on mem, it
> seems we invoke the OOM killer whether processes are causing the problem
> or not.
>
> OTOH, if we can't free the kernel mem, we don't have much choice, but
> it's not really helping much ;-)
>

I'm suspecting here that we simply leaked a refcount on every darn
pagecache page in the machine. Note how mapped memory has shrunk down to
less than a megabyte and everything which can be swapped out has been
swapped out.

If so, then oom-killing everything in the world is pretty inevitable.

2005-03-23 23:51:38

by Badari Pulavarty

[permalink] [raw]
Subject: Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

Andrew Morton wrote:

> "Martin J. Bligh" <[email protected]> wrote:
>
>>>>Nothing beats poking around in a dead machine's guts with kgdb though.
>>>
>>>Everyone his taste.
>>>
>>>But I was surprised by
>>>
>>>
>>>>SwapTotal: 1052216 kB
>>>>SwapFree: 1045984 kB
>>>
>>>Strange that processes are killed while lots of swap is available.
>>
>>I don't think we're that smart about it. If we're really low on mem, it
>>seems we invoke the OOM killer whether processes are causing the problem
>>or not.
>>
>>OTOH, if we can't free the kernel mem, we don't have much choice, but
>>it's not really helping much ;-)
>>
>
>
> I'm suspecting here that we simply leaked a refcount on every darn
> pagecache page in the machine. Note how mapped memory has shrunk down to
> less than a megabyte and everything which can be swapped out has been
> swapped out.

That makes sense. We have almost 485MB in active and inactive caches,
but we are not able to reclaim them :(

Active: 243580 kB
Inactive: 242248 kB

>
> If so, then oom-killing everything in the world is pretty inevitable.
>


2005-03-24 01:50:15

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

On Wed, Mar 23, 2005 at 03:42:32PM -0800, Andrew Morton wrote:
> I'm suspecting here that we simply leaked a refcount on every darn
> pagecache page in the machine. Note how mapped memory has shrunk down to
> less than a megabyte and everything which can be swapped out has been
> swapped out.
>
> If so, then oom-killing everything in the world is pretty inevitable.

Agreed, it looks like a memleak of a page_count (while mapcount is fine).

I would suggest looking for pages that are part of the pagecache (i.e.
page->mapping not null) that have a mapcount of 0 and a page_count > 1;
almost all of them should be like that during the memleak, and almost
none should be like that before the memleak.
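
Something along these lines (a completely untested sketch using 2.6-era
helpers; num_physpages is only a rough bound for the pfn walk and the
scan is racy, but for a one-shot debugging module that should be ok)
would count the suspect pages:

/* pagecount_scan.c - untested debugging sketch: walk the pfns and count
 * pagecache pages (page->mapping set, not anon) with mapcount 0 but
 * page_count > 1, i.e. an extra reference with nobody mapping them. */
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>

static int __init pagecount_scan_init(void)
{
	unsigned long pfn, pagecache = 0, suspect = 0;

	for (pfn = 0; pfn < num_physpages; pfn++) {
		struct page *page;

		if (!pfn_valid(pfn))
			continue;
		page = pfn_to_page(pfn);
		if (PageReserved(page) || PageSlab(page))
			continue;
		if (!page->mapping || PageAnon(page))
			continue;	/* not pagecache */
		pagecache++;
		if (page_mapcount(page) == 0 && page_count(page) > 1)
			suspect++;
	}
	printk(KERN_INFO "pagecache pages: %lu, suspect (mapcount 0, "
	       "count > 1): %lu\n", pagecache, suspect);
	return 0;
}

static void __exit pagecount_scan_exit(void)
{
}

module_init(pagecount_scan_init);
module_exit(pagecount_scan_exit);
MODULE_LICENSE("GPL");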

This seems unrelated to the bug that started the thread, which was clearly
a slab-shrinking issue and not a pagecache memleak.

Thanks.

2005-03-24 02:01:19

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

Andrea Arcangeli <[email protected]> wrote:
>
> On Wed, Mar 23, 2005 at 03:42:32PM -0800, Andrew Morton wrote:
> > I'm suspecting here that we simply leaked a refcount on every darn
> > pagecache page in the machine. Note how mapped memory has shrunk down to
> > less than a megabyte and everything which can be swapped out has been
> > swapped out.
> >
> > If so, then oom-killing everything in the world is pretty inevitable.
>
> Agreed, it looks like a memleak of a page_count (while mapcount is fine).
>
> I would suggest looking after pages part of pagecache (i.e.
> page->mapcount not null) that have a mapcount of 0 and a page_count > 1,
> almost all of them should be like that during the memleak, and almost
> none should be like that before the memleak.
>
> This seems unrelated to the bug that started the thread that was clearly
> a slab shrinking issue and not a pagecache memleak.
>

The vmscan.c changes in -rc1 look harmless enough. That's assuming that
2.6.11 doesn't have the bug.

btw, that new orphaned-page handling code has a printk in it, and nobody
has reported it coming out yet...

2005-03-25 22:04:02

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

Mingming Cao <[email protected]> wrote:
>
> I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> hours the system hit OOM, and OOM keep killing processes one by one. I
> could reproduce this problem very constantly on a 2 way PIII 700MHZ with
> 512MB RAM. Also the problem could be reproduced on running the same test
> on reiser fs.
>
> The fsx command is:
>
> ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &

I was able to reproduce this on ext3. Seven instances of the above leaked
10-15MB over 10 hours. All of it permanently stuck on the LRU.

I'll continue to poke at it - see what kernel it started with, which
filesystems it affects, whether it happens on UP&&!PREEMPT, etc. Not a
quick process.

Given that you also saw it on reiserfs, it might be a bug in the core
mmap/truncate/unmap handling. We'll see.

> I also see fsx tests start to generating report about read bad data
> about the tests have run for about 9 hours(one hour before of the OOM
> happen).

I haven't noticed anything like that.

2005-03-25 22:21:07

by Badari Pulavarty

[permalink] [raw]
Subject: Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

On Fri, 2005-03-25 at 13:56, Andrew Morton wrote:
> Mingming Cao <[email protected]> wrote:
> >
> > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> > hours the system hit OOM, and OOM keep killing processes one by one. I
> > could reproduce this problem very constantly on a 2 way PIII 700MHZ with
> > 512MB RAM. Also the problem could be reproduced on running the same test
> > on reiser fs.
> >
> > The fsx command is:
> >
> > ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
>
> I was able to reproduce this on ext3. Seven instances of the above leaked
> 10-15MB over 10 hours. All of it permanently stuck on the LRU.
>
> I'll continue to poke at it - see what kernel it started with, which
> filesystems it affects, whether it happens on UP&&!PREEMPT, etc. Not a
> quick process.

I reproduced a *similar* issue with 2.6.11. The reason I say similar is
that there is no OOM kill, but free memory is very low and the machine
doesn't respond at all. (I booted my machine with 256M memory and ran 20
copies of fsx on ext3.)


Thanks,
Badari

2005-03-26 00:17:29

by Dave Jones

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

On Wed, Mar 23, 2005 at 11:53:04AM -0800, Mingming Cao wrote:

> The fsx command is:
>
> ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
>
> I also see fsx tests start to generating report about read bad data
> about the tests have run for about 9 hours(one hour before of the OOM
> happen).

Is writing to the same testfile from multiple fsx's supposed to work?
It sounds like a surefire way to break the consistency checking that it does.
I'm surprised it lasts 9hrs before it breaks.

In the past I've done tests like..

for i in `seq 1 100`
do
        fsx foo$i &
done

to make each process use a different test file.

Dave

2005-03-26 00:27:04

by Badari Pulavarty

[permalink] [raw]
Subject: Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

On Fri, 2005-03-25 at 16:17, Dave Jones wrote:
> On Wed, Mar 23, 2005 at 11:53:04AM -0800, Mingming Cao wrote:
>
> > The fsx command is:
> >
> > ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
> >
> > I also see fsx tests start to generating report about read bad data
> > about the tests have run for about 9 hours(one hour before of the OOM
> > happen).
>
> Is writing to the same testfile from multiple fsx's supposed to work?
> It sounds like a surefire way to break the consistency checking that it does.
> I'm surprised it lasts 9hrs before it breaks.
>
> In the past I've done tests like..
>
> for i in `seq 1 100`
> do
> fsx foo$i &
> done
>
> to make each process use a different test file.
>

No, we are running them on different files - Mingming cut and pasted
only a single line from the script.

Thanks,
Badari

2005-03-27 00:24:08

by Mingming Cao

[permalink] [raw]
Subject: Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

On Fri, 2005-03-25 at 14:11 -0800, Badari Pulavarty wrote:
> On Fri, 2005-03-25 at 13:56, Andrew Morton wrote:
> > Mingming Cao <[email protected]> wrote:
> > >
> > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> > > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> > > hours the system hit OOM, and OOM keep killing processes one by one. I
> > > could reproduce this problem very constantly on a 2 way PIII 700MHZ with
> > > 512MB RAM. Also the problem could be reproduced on running the same test
> > > on reiser fs.
> > >
> > > The fsx command is:
> > >
> > > ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
> >
> > I was able to reproduce this on ext3. Seven instances of the above leaked
> > 10-15MB over 10 hours. All of it permanently stuck on the LRU.
> >
> > I'll continue to poke at it - see what kernel it started with, which
> > filesystems it affects, whether it happens on UP&&!PREEMPT, etc. Not a
> > quick process.
>
> I reproduced *similar* issue with 2.6.11. The reason I say similar, is
> there is no OOM kill, but very low free memory and machine doesn't
> respond at all. (I booted my machine with 256M memory and ran 20 copies
> of fsx on ext3).
>
>

Yes, I re-ran the same test on 2.6.11 for 24 hours. Like Badari saw on
his machine, my machine did not go OOM on 2.6.11 and is still alive, but
free memory is very low (only 5MB free). Killing all the fsx instances
and unmounting the ext3 filesystem did not bring back much memory. I am
going to rerun the tests without the mapped reads/writes to see what
happens.
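(With the classic fsx this is typically done by adding -R and -W, which
restrict the test to read()/write() system calls and skip the mmap
paths; the exact switches can vary between fsx variants, so the line
below is only a sketch of the intended invocation.)

./fsx -c 10 -n -R -W -r 4096 -w 4096 /mnt/test/foo1 &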


2005-03-27 19:26:14

by Mingming Cao

[permalink] [raw]
Subject: Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

On Sat, 2005-03-26 at 16:23 -0800, Mingming Cao wrote:
> On Fri, 2005-03-25 at 14:11 -0800, Badari Pulavarty wrote:
> > On Fri, 2005-03-25 at 13:56, Andrew Morton wrote:
> > > Mingming Cao <[email protected]> wrote:
> > > >
> > > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> > > > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> > > > hours the system hit OOM, and OOM keep killing processes one by one. I
> > > > could reproduce this problem very constantly on a 2 way PIII 700MHZ with
> > > > 512MB RAM. Also the problem could be reproduced on running the same test
> > > > on reiser fs.
> > > >
> > > > The fsx command is:
> > > >
> > > > ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
> > >
> > > I was able to reproduce this on ext3. Seven instances of the above leaked
> > > 10-15MB over 10 hours. All of it permanently stuck on the LRU.
> > >
> > > I'll continue to poke at it - see what kernel it started with, which
> > > filesystems it affects, whether it happens on UP&&!PREEMPT, etc. Not a
> > > quick process.
> >
> > I reproduced *similar* issue with 2.6.11. The reason I say similar, is
> > there is no OOM kill, but very low free memory and machine doesn't
> > respond at all. (I booted my machine with 256M memory and ran 20 copies
> > of fsx on ext3).
> >
> >
>
> Yes, I re-run the same test on 2.6.11 for 24 hours, like Badari see on
> his machine, my machine did not go to OOM on 2.6.11,still alive, but
> memory is very low(only 5M free). Killed all fsx and umount the ext3
> filesystem did not bring back much memory. I will going to rerun the
> tests without the mapped read/write to see what happen.
>
>

Running the fsx tests without mapped IO on 2.6.11 seems fine. Here is
/proc/meminfo after an 18-hour run:

# cat /proc/meminfo
MemTotal: 510464 kB
MemFree: 6004 kB
Buffers: 179420 kB
Cached: 9144 kB
SwapCached: 0 kB
Active: 313236 kB
Inactive: 171380 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 510464 kB
LowFree: 6004 kB
SwapTotal: 1052216 kB
SwapFree: 1052216 kB
Dirty: 2100 kB
Writeback: 0 kB
Mapped: 24884 kB
Slab: 14788 kB
CommitLimit: 1307448 kB
Committed_AS: 78032 kB
PageTables: 720 kB
VmallocTotal: 516024 kB
VmallocUsed: 1672 kB
VmallocChunk: 514352 kB

elm3b92:~ # killall -9 fsx
elm3b92:~ # cat /proc/meminfo
MemTotal: 510464 kB
MemFree: 21332 kB
Buffers: 179668 kB
Cached: 8828 kB
SwapCached: 0 kB
Active: 298748 kB
Inactive: 171152 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 510464 kB
LowFree: 21332 kB
SwapTotal: 1052216 kB
SwapFree: 1052216 kB
Dirty: 1140 kB
Writeback: 0 kB
Mapped: 11648 kB
Slab: 14632 kB
CommitLimit: 1307448 kB
Committed_AS: 59800 kB
PageTables: 492 kB
VmallocTotal: 516024 kB
VmallocUsed: 1672 kB
VmallocChunk: 514352 kB

elm3b92:~ # umount /mnt/ext3
elm3b92:~ # cat /proc/meminfo
MemTotal: 510464 kB
MemFree: 181636 kB
Buffers: 22092 kB
Cached: 6740 kB
SwapCached: 0 kB
Active: 151284 kB
Inactive: 158948 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 510464 kB
LowFree: 181636 kB
SwapTotal: 1052216 kB
SwapFree: 1052216 kB
Dirty: 0 kB
Writeback: 0 kB
Mapped: 11656 kB
Slab: 14052 kB
CommitLimit: 1307448 kB
Committed_AS: 59800 kB
PageTables: 492 kB
VmallocTotal: 516024 kB
VmallocUsed: 1672 kB
VmallocChunk: 514352 kB


2005-03-27 20:07:07

by Badari Pulavarty

[permalink] [raw]
Subject: Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

Mingming Cao wrote:

> On Sat, 2005-03-26 at 16:23 -0800, Mingming Cao wrote:
>
>>On Fri, 2005-03-25 at 14:11 -0800, Badari Pulavarty wrote:
>>
>>>On Fri, 2005-03-25 at 13:56, Andrew Morton wrote:
>>>
>>>>Mingming Cao <[email protected]> wrote:
>>>>
>>>>>I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
>>>>>2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
>>>>>hours the system hit OOM, and OOM keep killing processes one by one. I
>>>>>could reproduce this problem very constantly on a 2 way PIII 700MHZ with
>>>>>512MB RAM. Also the problem could be reproduced on running the same test
>>>>>on reiser fs.
>>>>>
>>>>>The fsx command is:
>>>>>
>>>>>./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
>>>>
>>>>I was able to reproduce this on ext3. Seven instances of the above leaked
>>>>10-15MB over 10 hours. All of it permanently stuck on the LRU.
>>>>
>>>>I'll continue to poke at it - see what kernel it started with, which
>>>>filesystems it affects, whether it happens on UP&&!PREEMPT, etc. Not a
>>>>quick process.
>>>
>>>I reproduced *similar* issue with 2.6.11. The reason I say similar, is
>>>there is no OOM kill, but very low free memory and machine doesn't
>>>respond at all. (I booted my machine with 256M memory and ran 20 copies
>>>of fsx on ext3).
>>>
>>>
>>
>>Yes, I re-run the same test on 2.6.11 for 24 hours, like Badari see on
>>his machine, my machine did not go to OOM on 2.6.11,still alive, but
>>memory is very low(only 5M free). Killed all fsx and umount the ext3
>>filesystem did not bring back much memory. I will going to rerun the
>>tests without the mapped read/write to see what happen.
>>
>>
>
>
> Run fsx tests without mapped IO on 2.6.11 seems fine. Here is
> the /proc/meminfo after 18 hours run:

Mingming, please reproduce it on 2.6.11 with the mapped IO tests. That
will tell us when the regression started.

Thanks,
Badari

2005-03-27 20:17:19

by Badari Pulavarty

[permalink] [raw]
Subject: Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

Badari Pulavarty wrote:

> Mingming Cao wrote:
>
>> On Sat, 2005-03-26 at 16:23 -0800, Mingming Cao wrote:
>>
>>> On Fri, 2005-03-25 at 14:11 -0800, Badari Pulavarty wrote:
>>>
>>>> On Fri, 2005-03-25 at 13:56, Andrew Morton wrote:
>>>>
>>>>> Mingming Cao <[email protected]> wrote:
>>>>>
>>>>>> I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx
>>>>>> tests on
>>>>>> 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
>>>>>> hours the system hit OOM, and OOM keep killing processes one by
>>>>>> one. I
>>>>>> could reproduce this problem very constantly on a 2 way PIII
>>>>>> 700MHZ with
>>>>>> 512MB RAM. Also the problem could be reproduced on running the
>>>>>> same test
>>>>>> on reiser fs.
>>>>>>
>>>>>> The fsx command is:
>>>>>>
>>>>>> ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
>>>>>
>>>>>
>>>>> I was able to reproduce this on ext3. Seven instances of the above
>>>>> leaked
>>>>> 10-15MB over 10 hours. All of it permanently stuck on the LRU.
>>>>>
>>>>> I'll continue to poke at it - see what kernel it started with, which
>>>>> filesystems it affects, whether it happens on UP&&!PREEMPT, etc.
>>>>> Not a
>>>>> quick process.
>>>>
>>>>
>>>> I reproduced *similar* issue with 2.6.11. The reason I say similar, is
>>>> there is no OOM kill, but very low free memory and machine doesn't
>>>> respond at all. (I booted my machine with 256M memory and ran 20 copies
>>>> of fsx on ext3).
>>>>
>>>>
>>>
>>> Yes, I re-run the same test on 2.6.11 for 24 hours, like Badari see on
>>> his machine, my machine did not go to OOM on 2.6.11,still alive, but
>>> memory is very low(only 5M free). Killed all fsx and umount the ext3
>>> filesystem did not bring back much memory. I will going to rerun the
>>> tests without the mapped read/write to see what happen.
>>>
>>>
>>
>>
>> Run fsx tests without mapped IO on 2.6.11 seems fine. Here is
>> the /proc/meminfo after 18 hours run:
>
>
> Mingming, Reproduce it on 2.6.11 with mapped IO tests. That will tell
> us when the regression started.

Sorry - ignore my request; Mingming has already done this work and
posted the result.

Thanks,
Badari

2005-04-04 01:36:39

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

Mingming Cao <[email protected]> wrote:
>
> I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> hours the system hit OOM, and OOM keep killing processes one by one. I
> could reproduce this problem very constantly on a 2 way PIII 700MHZ with
> 512MB RAM. Also the problem could be reproduced on running the same test
> on reiser fs.
>
> The fsx command is:
>
> ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &


This ext3 bug goes all the way back to 2.6.6.

I don't know yet why you saw problems with reiser3 and I'm pretty sure I
saw problems with ext2. More testing is needed there.

Without the below patch it's possible to make ext3 leak at around a
megabyte per minute by arranging for the fs to run a commit every 50
milliseconds, btw.

(Stephen, please review...)



This fixes the lots-of-fsx-linux-instances-cause-a-slow-leak bug.

It's been there since 2.6.6, caused by:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.5/2.6.5-mm4/broken-out/jbd-move-locked-buffers.patch

That patch moves under-writeout ordered-data buffers onto a separate journal
list during commit. It took out the old code which was based on a single
list.

The old code (necessarily) had logic which would restart I/O against buffers
which had been redirtied while they were on the committing transaction's
t_sync_datalist list. The new code only writes buffers once, ignoring
redirtyings by a later transaction, which is good.

But over on the truncate side of things, in journal_unmap_buffer(), we're
treating buffers on the t_locked_list as inviolable things which belong to the
committing transaction, and we just leave them alone during concurrent
truncate-vs-commit.

The net effect is that when truncate tries to invalidate a page whose buffers
are on t_locked_list and have been redirtied, journal_unmap_buffer() just
leaves those buffers alone. truncate will remove the page from its mapping
and we end up with an anonymous clean page with dirty buffers, which is an
illegal state for a page. The JBD commit will not clean those buffers as they
are removed from t_locked_list. The VM (try_to_free_buffers) cannot reclaim
these pages.

The patch teaches journal_unmap_buffer() about buffers which are on the
committing transaction's t_locked_list. These buffers have been written and
I/O has completed. We can take them off the transaction and undirty them
within the context of journal_invalidatepage()->journal_unmap_buffer().

Signed-off-by: Andrew Morton <[email protected]>
---

25-akpm/fs/jbd/transaction.c | 13 +++++++++++--
1 files changed, 11 insertions(+), 2 deletions(-)

diff -puN fs/jbd/transaction.c~jbd-dirty-buffer-leak-fix fs/jbd/transaction.c
--- 25/fs/jbd/transaction.c~jbd-dirty-buffer-leak-fix 2005-04-03 15:12:12.000000000 -0700
+++ 25-akpm/fs/jbd/transaction.c 2005-04-03 15:14:40.000000000 -0700
@@ -1812,7 +1812,17 @@ static int journal_unmap_buffer(journal_
}
}
} else if (transaction == journal->j_committing_transaction) {
- /* If it is committing, we simply cannot touch it. We
+ if (jh->b_jlist == BJ_Locked) {
+ /*
+ * The buffer is on the committing transaction's locked
+ * list. We have the buffer locked, so I/O has
+ * completed. So we can nail the buffer now.
+ */
+ may_free = __dispose_buffer(jh, transaction);
+ goto zap_buffer;
+ }
+ /*
+ * If it is committing, we simply cannot touch it. We
* can remove it's next_transaction pointer from the
* running transaction if that is set, but nothing
* else. */
@@ -1887,7 +1897,6 @@ int journal_invalidatepage(journal_t *jo
unsigned int next_off = curr_off + bh->b_size;
next = bh->b_this_page;

- /* AKPM: doing lock_buffer here may be overly paranoid */
if (offset <= curr_off) {
/* This block is wholly outside the truncation point */
lock_buffer(bh);
_

2005-04-04 16:56:56

by Mingming Cao

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

On Sun, 2005-04-03 at 18:35 -0700, Andrew Morton wrote:
> Mingming Cao <[email protected]> wrote:
> >
> > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> > hours the system hit OOM, and OOM keep killing processes one by one. I
> > could reproduce this problem very constantly on a 2 way PIII 700MHZ with
> > 512MB RAM. Also the problem could be reproduced on running the same test
> > on reiser fs.
> >
> > The fsx command is:
> >
> > ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
>
>
> This ext3 bug goes all the way back to 2.6.6.

> I don't know yet why you saw problems with reiser3 and I'm pretty sure I
> saw problems with ext2. More testing is needed there.
>

We (Janet and I) are chasing this bug as well. Janet is able to
reproduce this bug on 2.6.9 but I can't. Glad to know you have nailed
down this issue on ext3. I am pretty sure I saw this on Reiser3 once; I
will double-check it. I will try your patch today.

Thanks,
Mingming

2005-04-04 20:14:10

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

Mingming Cao <[email protected]> wrote:
>
> On Sun, 2005-04-03 at 18:35 -0700, Andrew Morton wrote:
> > Mingming Cao <[email protected]> wrote:
> > >
> > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> > > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> > > hours the system hit OOM, and OOM keep killing processes one by one. I
> > > could reproduce this problem very constantly on a 2 way PIII 700MHZ with
> > > 512MB RAM. Also the problem could be reproduced on running the same test
> > > on reiser fs.
> > >
> > > The fsx command is:
> > >
> > > ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
> >
> >
> > This ext3 bug goes all the way back to 2.6.6.
>
> > I don't know yet why you saw problems with reiser3 and I'm pretty sure I
> > saw problems with ext2. More testing is needed there.
> >
>
> We (Janet and I) are chasing this bug as well. Janet is able to
> reproduce this bug on 2.6.9 but I can't. Glad to know you have nail down
> this issue on ext3. I am pretty sure I saw this on Reiser3 once, I will
> double check it. Will try your patch today.

There's a second leak, with similar-looking symptoms. At ~50
commits/second it has leaked ~10MB in 24 hours, so it's very slow - less
than a hundredth the rate of the first one.
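(For scale: the first leak ran at around a megabyte per minute, i.e.
roughly 1.4GB per day, so ~10MB in 24 hours is indeed well under a
hundredth of that rate.)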

2005-04-04 20:35:02

by Andrew Morton

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

"Martin J. Bligh" <[email protected]> wrote:
>
> >> > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> >> > > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> >> > > hours the system hit OOM, and OOM keep killing processes one by one. I
> >> > > could reproduce this problem very constantly on a 2 way PIII 700MHZ with
> >> > > 512MB RAM. Also the problem could be reproduced on running the same test
> >> > > on reiser fs.
> >> > >
> >> > > The fsx command is:
> >> > >
> >> > > ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
> >> >
> >> >
> >> > This ext3 bug goes all the way back to 2.6.6.
> >>
> >> > I don't know yet why you saw problems with reiser3 and I'm pretty sure I
> >> > saw problems with ext2. More testing is needed there.
> >> >
> >>
> >> We (Janet and I) are chasing this bug as well. Janet is able to
> >> reproduce this bug on 2.6.9 but I can't. Glad to know you have nail down
> >> this issue on ext3. I am pretty sure I saw this on Reiser3 once, I will
> >> double check it. Will try your patch today.
> >
> > There's a second leak, with similar-looking symptoms. At ~50
> > commits/second it has leaked ~10MB in 24 hours, so it's very slow - less
> > than a hundredth the rate of the first one.
>
> What are you using to see these with, just kgdb, and a large cranial
> capacity? Or is there some more magic?
>

Nothing magical: run the test for a while, kill everything, cause a huge
swapstorm, then look at the meminfo numbers. If active+inactive is
significantly larger than cached+buffers+swapcached+mapped+minus-a-bit
then it's leaked.

Right now I have:

MemTotal: 246264 kB
MemFree: 196148 kB
Buffers: 4200 kB
Cached: 3308 kB
SwapCached: 8064 kB
Active: 21548 kB
Inactive: 12532 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 246264 kB
LowFree: 196148 kB
SwapTotal: 1020116 kB
SwapFree: 1001612 kB
Dirty: 60 kB
Writeback: 0 kB
Mapped: 2284 kB
Slab: 12200 kB
CommitLimit: 1143248 kB
Committed_AS: 34004 kB
PageTables: 1200 kB
VmallocTotal: 774136 kB
VmallocUsed: 82832 kB
VmallocChunk: 691188 kB
HugePages_Total: 0
HugePages_Free: 0

33 megs on the LRU, unaccounted for in other places.

Once the leak is nice and large I can start a new swapstorm, set a
breakpoint in try_to_free_buffers() (for example) and start looking at the
state of the page and its buffers.
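
A rough way to script that check (the field list is an assumption based
on the categories named above, and the "minus-a-bit" slack is ignored -
it simply reports the raw delta, all values in kB):

# Print Active+Inactive minus the accounted-for categories from
# /proc/meminfo; a large and steadily growing delta is the leak
# signature described above.
awk '/^(Active|Inactive|Buffers|Cached|SwapCached|Mapped):/ { v[$1] = $2 }
     END {
        lru = v["Active:"] + v["Inactive:"]
        acct = v["Buffers:"] + v["Cached:"] + v["SwapCached:"] + v["Mapped:"]
        printf "lru=%dkB accounted=%dkB delta=%dkB\n", lru, acct, lru - acct
     }' /proc/meminfo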


2005-04-04 20:35:02

by Martin J. Bligh

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

>> > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
>> > > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
>> > > hours the system hit OOM, and OOM keep killing processes one by one. I
>> > > could reproduce this problem very constantly on a 2 way PIII 700MHZ with
>> > > 512MB RAM. Also the problem could be reproduced on running the same test
>> > > on reiser fs.
>> > >
>> > > The fsx command is:
>> > >
>> > > ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
>> >
>> >
>> > This ext3 bug goes all the way back to 2.6.6.
>>
>> > I don't know yet why you saw problems with reiser3 and I'm pretty sure I
>> > saw problems with ext2. More testing is needed there.
>> >
>>
>> We (Janet and I) are chasing this bug as well. Janet is able to
>> reproduce this bug on 2.6.9 but I can't. Glad to know you have nail down
>> this issue on ext3. I am pretty sure I saw this on Reiser3 once, I will
>> double check it. Will try your patch today.
>
> There's a second leak, with similar-looking symptoms. At ~50
> commits/second it has leaked ~10MB in 24 hours, so it's very slow - less
> than a hundredth the rate of the first one.

What are you using to see these with, just kgdb, and a large cranial
capacity? Or is there some more magic?

m.

2005-04-05 16:44:47

by Stephen C. Tweedie

[permalink] [raw]
Subject: Re: OOM problems on 2.6.12-rc1 with many fsx tests

Hi,

On Mon, 2005-04-04 at 02:35, Andrew Morton wrote:

> Without the below patch it's possible to make ext3 leak at around a
> megabyte per minute by arranging for the fs to run a commit every 50
> milliseconds, btw.

Ouch!
> (Stephen, please review...)

Doing so now.

> The patch teaches journal_unmap_buffer() about buffers which are on the
> committing transaction's t_locked_list. These buffers have been written and
> I/O has completed.

Agreed. The key here is that the buffer is locked before
journal_unmap_buffer() is called, so we can indeed rely on it being
safely on disk.

> We can take them off the transaction and undirty them
> within the context of journal_invalidatepage()->journal_unmap_buffer().

Right - the committing transaction can't be doing any more writes, and
the current transaction has explicitly told us to throw away its own
writes if we get here. Unfiling the buffer should be safe.

> + if (jh->b_jlist == BJ_Locked) {
> + /*
> + * The buffer is on the committing transaction's locked
> + * list. We have the buffer locked, so I/O has
> + * completed. So we can nail the buffer now.
> + */
> + may_free = __dispose_buffer(jh, transaction);
> + goto zap_buffer;
> + }

ACK.

--Stephen

2005-04-05 17:26:50

by Mingming Cao

[permalink] [raw]
Subject: Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

On Mon, 2005-04-04 at 13:04 -0700, Andrew Morton wrote:
> Mingming Cao <[email protected]> wrote:
> >
> > On Sun, 2005-04-03 at 18:35 -0700, Andrew Morton wrote:
> > > Mingming Cao <[email protected]> wrote:
> > > >
> > > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> > > > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> > > > hours the system hit OOM, and OOM keep killing processes one by one. I
> > > > could reproduce this problem very constantly on a 2 way PIII 700MHZ with
> > > > 512MB RAM. Also the problem could be reproduced on running the same test
> > > > on reiser fs.
> > > >
> > > > The fsx command is:
> > > >
> > > > ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
> > >
> > >
> > > This ext3 bug goes all the way back to 2.6.6.
> >
> > > I don't know yet why you saw problems with reiser3 and I'm pretty sure I
> > > saw problems with ext2. More testing is needed there.
> > >
> >
> > We (Janet and I) are chasing this bug as well. Janet is able to
> > reproduce this bug on 2.6.9 but I can't. Glad to know you have nail down
> > this issue on ext3. I am pretty sure I saw this on Reiser3 once, I will
> > double check it. Will try your patch today.
>
> There's a second leak, with similar-looking symptoms. At ~50
> commits/second it has leaked ~10MB in 24 hours, so it's very slow - less
> than a hundredth the rate of the first one.
>

I ran the test (20 instances of fsx) with your patch on 2.6.12-rc1 with
512MB RAM (where I was able to consistently re-create the memory leak
and drive the system to OOM before). The result is that the kernel did
not get into OOM after about 19 hours (before, it took about 9 hours or
so), and the system is still responsive. However, I did notice a ~60MB
delta between Active+Inactive and Buffers+Cached+SwapCached+Mapped+Slab.

Here is the current /proc/meminfo

elm3b92:~ # cat /proc/meminfo
MemTotal: 510400 kB
MemFree: 97004 kB
Buffers: 196772 kB
Cached: 77608 kB
SwapCached: 0 kB
Active: 299064 kB
Inactive: 83140 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 510400 kB
LowFree: 97004 kB
SwapTotal: 1052216 kB
SwapFree: 1052216 kB
Dirty: 1600 kB
Writeback: 0 kB
Mapped: 23256 kB
Slab: 24548 kB
CommitLimit: 1307416 kB
Committed_AS: 73560 kB
PageTables: 532 kB
VmallocTotal: 516024 kB
VmallocUsed: 3700 kB
VmallocChunk: 512320 kB
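
(Working the numbers above: Active+Inactive = 382204 kB, while
Buffers+Cached+SwapCached+Mapped+Slab = 322184 kB, i.e. a delta of
60020 kB - consistent with the ~60MB figure.)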



2005-04-06 01:24:08

by Andrew Morton

[permalink] [raw]
Subject: Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

Mingming Cao <[email protected]> wrote:
>
> I run the test(20 instances of fsx) with your patch on 2.6.12-rc1 with
> 512MB RAM (where I were able to constantly re-create the mem leak and
> lead to OOM before). The result is the kernel did not get into OOM after
> about 19 hours(before it took about 9 hours or so), system is still
> responsive. However I did notice about ~60MB delta between Active
> +Inactive and Buffers+cached+Swapcached+Mapped+Slab

yes.

Nobody has noticed the now-fixed leak since 2.6.6 and this one appears to
be 100x slower. Which is fortunate because this one is going to take a
long time to fix. I'll poke at it some more.

2005-04-12 13:02:28

by Stephen C. Tweedie

[permalink] [raw]
Subject: Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

Hi,

On Wed, 2005-04-06 at 02:23, Andrew Morton wrote:

> Nobody has noticed the now-fixed leak since 2.6.6 and this one appears to
> be 100x slower. Which is fortunate because this one is going to take a
> long time to fix. I'll poke at it some more.

OK, I'm now at the stage where I can kick off that fsx test on a kernel
without your leak fix, kill it, umount and get

Whoops: found 43 unfreeable buffers still on the superblock debug list
for sb 00000100296b2d48. Tracing one...
buffer trace for buffer at 0x000001003edaa9c8 (I am CPU 0)
...

with a trace pointing to journal_unmap_buffer(). I'll try with the fix
in place to see if there are any other cases showing up with the same
problem.

--Stephen

2005-04-13 13:46:37

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

On Fri, Mar 18, 2005 at 11:12:18AM -0500, Noah Meyerhans wrote:
> Well, that's certainly an interesting question. The filesystem is IBM's
> JFS. If you tell me that's part of the problem, I'm not likely to
> disagree. 8^)

It would be nice if you could reproduce with ext3 or reiserfs (if with
ext3, after applying the memleak fix from Andrew that was found in this
same thread ;). The below makes it look like a jfs problem.

830696 830639 99% 0.80K 207674 4 830696K jfs_ip
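
That looks like slabtop-style output; to run the same check on an
affected box, something like the following should do it (slabtop and
/proc/slabinfo are standard tools, though the exact column layout varies
by kernel version):

grep jfs_ip /proc/slabinfo    # object counts and size for the JFS inode cache
slabtop -o | head -20         # one-shot view of the largest slab caches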

2005-04-14 16:59:15

by Noah Meyerhans

[permalink] [raw]
Subject: Re: OOM problems with 2.6.11-rc4

On Wed, Apr 13, 2005 at 03:47:40PM +0200, Andrea Arcangeli wrote:
> On Fri, Mar 18, 2005 at 11:12:18AM -0500, Noah Meyerhans wrote:
> > Well, that's certainly an interesting question. The filesystem is IBM's
> > JFS. If you tell me that's part of the problem, I'm not likely to
> > disagree. 8^)
>
> It would be nice if you could reproduce with ext3 or reiserfs (if with
> ext3, after applying the memleak fix from Andrew that was found in this
> same thread ;). The below make it look like a jfs problem.
>
> 830696 830639 99% 0.80K 207674 4 830696K jfs_ip

I'll see what I can do. It may be difficult to move all the data to a
different filesystem. There are multiple terabytes in use.

I'll refer the JFS developers to this thread, too; they may be able to
shed some light on it.

Thanks.
noah

--
Noah Meyerhans System Administrator
MIT Computer Science and Artificial Intelligence Laboratory

