2005-02-08 22:57:19

by Cliff White

Subject: 2.6.10-ac12 + kernbench == oom-killer: (OSDL)


Running 2.6.10-ac10 on the STP 1-CPU machines, we don't seem to be able to complete
a kernbench run without hitting the OOM-killer. (kernbench is multiple kernel compiles,
of course.) Machine is an 800 MHz PIII with 1GB memory. We reduce memory for some of the runs.

Typical results:

stp1-001 login: oom-killer: gfp_mask=0xd2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages: 14084kB (0kB HighMem)
Active:95617 inactive:4153 dirty:0 writeback:78 unstable:0 free:3521 slab:10320 mapped:99590 pagetables:12514
DMA free:1860kB min:88kB low:108kB high:132kB active:3512kB inactive:3428kB present:16384kB pages_scanned:3318 all_unreclaimable? no
protections[]: 0 0 0
Normal free:12224kB min:2800kB low:3500kB high:4200kB active:378956kB inactive:13184kB present:506880kB pages_scanned:10146 all_unreclaimable? no
protections[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
DMA: 375*4kB 33*8kB 2*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1860kB
Normal: 2194*4kB 107*8kB 24*16kB 1*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 12224kB
HighMem: empty
Swap cache: add 14113357, delete 14112531, find 151467/1660782, race 427+1738
Out of Memory: Killed process 14970 (cc1).
-------------------------
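For anyone reading the dump above: the per-zone lines of the form "count*sizekB ... = total" list how many free blocks of each size the zone holds. A minimal Python sketch, assuming that line format, that re-adds the blocks and compares the result with the printed total (the helper name buddy_free_kb is made up for illustration):

```python
import re

def buddy_free_kb(line):
    """Sum the 'count*sizekB' entries of one free-list line from the OOM dump.

    Returns (zone name, computed total in kB, total printed after '=').
    """
    zone, _, rest = line.partition(":")
    # Each entry is "<number of free blocks>*<block size>kB".
    blocks = [(int(n), int(size)) for n, size in re.findall(r"(\d+)\*(\d+)kB", rest)]
    computed = sum(n * size for n, size in blocks)
    m = re.search(r"=\s*(\d+)kB", rest)
    reported = int(m.group(1)) if m else None
    return zone.strip(), computed, reported

# The two free-list lines from the dump above, joined onto one line each.
dma = ("DMA: 375*4kB 33*8kB 2*16kB 0*32kB 1*64kB 0*128kB 0*256kB "
       "0*512kB 0*1024kB 0*2048kB 0*4096kB = 1860kB")
normal = ("Normal: 2194*4kB 107*8kB 24*16kB 1*32kB 0*64kB 1*128kB 0*256kB "
          "0*512kB 0*1024kB 1*2048kB 0*4096kB = 12224kB")

for line in (dma, normal):
    print(buddy_free_kb(line))
```

The two totals add up to the 14084kB shown on the "Free pages" line, and most of that free memory sits in single 4kB blocks.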
It looks like some oom-related stuff went into -ac10, will try retest with
-ac9 and -ac10, see what happens. Lemme know if we can do more

cliffw


--
"I've always gone through periods where I bolt upright at four in the morning;
now at least there's a reason." -Michael Feldman


2005-02-09 01:36:36

by Andries Brouwer

Subject: Re: 2.6.10-ac12 + kernbench == oom-killer: (OSDL)

On Tue, Feb 08, 2005 at 02:57:07PM -0800, cliff white wrote:

> Running 2.6.10-ac10 on the STP 1-CPU machines, we don't seem to be able to complete
> a kernbench run without hitting the OOM-killer. ( kernbench is multiple kernel compiles,
> of course ) Machine is 800 mhz PIII with 1GB memory. We reduce memory for some of the runs.
>
> Typical results:
>
> Out of Memory: Killed process 14970 (cc1).
> -------------------------
> It looks like some oom-related stuff went into -ac10, will try retest with
> -ac9 and -ac10, see what happens. Lemme know if we can do more

I am always curious to hear how things are when you set
/proc/sys/vm/overcommit_memory to 2
(and possibly /proc/sys/vm/overcommit_ratio to something
appropriate).

Andries
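As a rough guide to choosing an "appropriate" ratio: with overcommit_memory set to 2 the kernel refuses allocations beyond a commit limit, which is approximately swap plus overcommit_ratio percent of RAM (the default ratio is 50). A minimal Python sketch of that arithmetic follows. The swap size used here is a made-up figure, since the thread does not say how much swap the STP machines have, and both helper names are hypothetical.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50, hugetlb_kb=0):
    """Approximate commit limit in strict mode (overcommit_memory = 2).

    Sketch of: (RAM - hugetlb pages) * overcommit_ratio / 100 + swap.
    """
    return (ram_kb - hugetlb_kb) * overcommit_ratio // 100 + swap_kb

def set_vm_sysctl(name, value):
    """Write one value under /proc/sys/vm (needs root; not called in this sketch)."""
    with open(f"/proc/sys/vm/{name}", "w") as f:
        f.write(f"{value}\n")

RAM_KB = 1024 * 1024       # the 1GB machine from the report
SWAP_KB = 2 * 1024 * 1024  # ASSUMPTION: 2GB of swap, not stated in the thread

print(commit_limit_kb(RAM_KB, SWAP_KB))                       # default ratio of 50
print(commit_limit_kb(RAM_KB, SWAP_KB, overcommit_ratio=80))  # a higher ratio

# To apply the settings on a test box, as root:
#   set_vm_sysctl("overcommit_memory", 2)
#   set_vm_sysctl("overcommit_ratio", 80)
```

Note that strict accounting limits how much address space processes may commit; it does not by itself change how the kernel reclaims pages.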

2005-02-09 15:48:10

by Marcelo Tosatti

Subject: Re: 2.6.10-ac12 + kernbench == oom-killer: (OSDL)

On Tue, Feb 08, 2005 at 02:57:07PM -0800, cliff white wrote:
>
> Running 2.6.10-ac10 on the STP 1-CPU machines, we don't seem to be able to complete
> a kernbench run without hitting the OOM-killer. ( kernbench is multiple kernel compiles,
> of course ) Machine is 800 mhz PIII with 1GB memory. We reduce memory for some of the runs.

Cliff,

Please try a recent v2.6.11-rc3 kernel; it includes a series of OOM killer fixes from Andrea et al.

Thanks.

2005-02-17 20:49:39

by Cliff White

Subject: Re: 2.6.10-ac12 + kernbench == oom-killer: (OSDL)

On Wed, 9 Feb 2005 10:12:06 -0200
Marcelo Tosatti wrote:

> On Tue, Feb 08, 2005 at 02:57:07PM -0800, cliff white wrote:
> >
> > Running 2.6.10-ac10 on the STP 1-CPU machines, we don't seem to be able to complete
> > a kernbench run without hitting the OOM-killer. ( kernbench is multiple kernel compiles,
> > of course ) Machine is 800 mhz PIII with 1GB memory. We reduce memory for some of the runs.
>
> Cliff,
>
> Please try recent v2.6.11-rc3, they include a series of OOM killer fixes from Andrea et all.
>

Sorry for the delay in response. Recent -bk runs still show this problem, for example:
http://khack.osdl.org/stp/300713/logs/TestRunFailed.console.log.txt
( patch-2.6.11-rc3-bk4 )

cliffw

> Thanks.
>


--
"I've always gone through periods where I bolt upright at four in the morning;
now at least there's a reason." -Michael Feldman

2005-02-18 16:56:07

by Cliff White

Subject: Re: 2.6.10-ac12 + kernbench == oom-killer: (OSDL)

> On Tue, Feb 08, 2005 at 02:57:07PM -0800, cliff white wrote:
>
> > Running 2.6.10-ac10 on the STP 1-CPU machines, we don't seem to be able to complete
> > a kernbench run without hitting the OOM-killer. ( kernbench is multiple kernel compiles,
> > of course ) Machine is 800 mhz PIII with 1GB memory. We reduce memory for some of the runs.
> >
> > Typical results:
> >
> > Out of Memory: Killed process 14970 (cc1).
> > -------------------------
> > It looks like some oom-related stuff went into -ac10, will try retest with
> > -ac9 and -ac10, see what happens. Lemme know if we can do more
>
> I am always curious to hear how things are when you set
> /proc/sys/vm/overcommit_memory to 2
> (and possibly /proc/sys/vm/overcommit_ratio to something
> appropriate).

Okay, with just vm.overcommit_memory=2, things are still bad:
http://khack.osdl.org/stp/300854/logs/TestRunFailed.console.log.txt

Any suggestion for vm.overcommit_ratio?
Or should I repeat with a later -ac?
cliffw

-----------Some output---------------
Free pages: 8872kB (0kB HighMem)
Active:14865 inactive:4118 dirty:0 writeback:629 unstable:0 free:2218 slab:11489 mapped:32027 pagetables:13800
DMA free:1224kB min:128kB low:160kB high:192kB active:552kB inactive:196kB present:16384kB pages_scanned:401 all_unreclaimable? no
protections[]: 0 0 0
Normal free:7648kB min:1920kB low:2400kB high:2880kB active:58908kB inactive:16276kB present:245760kB pages_scanned:1395 all_unreclaimable? no
protections[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
DMA: 240*4kB 17*8kB 2*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1224kB
Normal: 1348*4kB 46*8kB 54*16kB 6*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 7648kB
HighMem: empty
Swap cache: add 23226854, delete 23224756, find 324015/2933249, race 2549+2365
Out of Memory: Killed process 14667 (rpm).
oom-killer: gfp_mask=0xd2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 30, high 90, batch 15
cpu 0 cold: low 0, high 30, batch 15
HighMem per-cpu: empty

------------------
cliffw



>
> Andries
>

2005-02-18 20:13:49

by Alan

Subject: Re: 2.6.10-ac12 + kernbench == oom-killer: (OSDL)

On Gwe, 2005-02-18 at 16:55, Cliff White wrote:
> Okay, with just vm.overcommit=2, things are still bad:
> http://khack.osdl.org/stp/300854/logs/TestRunFailed.console.log.txt
>
> Suggestion for vm.overcommit_ratio ?
> Or should i repeat with later -ac ?

That's showing up problems in the core code still. The OOM in this case
is because the kernel is deciding it is out of memory when it's merely
constipated with dirty pages for disk write, by the look of it.

Alan
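One way to check whether pages are piling up behind disk writes during a run is to sample the Dirty and Writeback fields of /proc/meminfo. A minimal Python sketch, assuming the usual "Name: value kB" layout of that file; the helper name and the sample text are made up for illustration and are not output from the STP machines:

```python
def dirty_and_writeback_kb(meminfo_text):
    """Return (Dirty, Writeback) in kB from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields.get("Dirty", 0), fields.get("Writeback", 0)

# ASSUMPTION: illustrative sample text only, not taken from the failing runs.
sample = """MemTotal:       255692 kB
MemFree:          8872 kB
Dirty:             120 kB
Writeback:        2516 kB
"""

print(dirty_and_writeback_kb(sample))

# On a live system, pass the real file contents instead:
#   with open("/proc/meminfo") as f:
#       print(dirty_and_writeback_kb(f.read()))
```

Sampling this every second or so while kernbench runs would show whether Writeback keeps growing just before the OOM killer fires.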

2005-02-21 17:15:58

by Cliff White

Subject: Re: 2.6.10-ac12 + kernbench == oom-killer: (OSDL)

On Fri, 18 Feb 2005 20:07:33 +0000
Alan Cox wrote:

> On Gwe, 2005-02-18 at 16:55, Cliff White wrote:
> > Okay, with just vm.overcommit=2, things are still bad:
> > http://khack.osdl.org/stp/300854/logs/TestRunFailed.console.log.txt
> >
> > Suggestion for vm.overcommit_ratio ?
> > Or should i repeat with later -ac ?
>
> Thats showing up problems in the core code still. The OOM in this case
> is because the kernel is deciding it is out of memory when it's merely
> constipated with dirty pages for disk write by the look of it.

Okay, same question - is there a tweak or a patch I can try here?
cliffw

>
> Alan
>


--
"I've always gone through periods where I bolt upright at four in the morning;
now at least there's a reason." -Michael Feldman