2002-01-04 20:33:32

by Phil Oester

Subject: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

On 2.4.17, I can't make -j bzImage without OOM kicking in. Relatively
light .config here - bzImage compiles to less than 1mb.

Seems with 1 gb of RAM and swap, the box should be able to handle this
(box is dual P3 600 btw).

Is this unreasonable? How much RAM should it take to accomplish this???

-Phil Oester


2002-01-04 21:03:15

by Stephan von Krawczynski

Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

On Fri, 4 Jan 2002 12:32:27 -0800
"Phil Oester" <[email protected]> wrote:

> On 2.4.17, I can't make -j bzImage without OOM kicking in. Relatively
> light .config here - bzImage compiles to less than 1mb.
>
> Seems with 1 gb of RAM and swap, the box should be able to handle this
> (box is dual P3 600 btw).
>
> Is this unreasonable? How much RAM should it take to accomplish this???

You should give a bit more info on that, especially vmstat and the like.
I cannot reproduce this. Neither on 1GB/256MB nor on 2GB/256MB RAM/SWAP.
(P3-1GHz, dual SMP, 2.4.17)

Regards,
Stephan
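
A minimal sketch of one way to capture "vmstat and the like" while the build runs, assuming a Linux /proc/meminfo with "Name: <n> kB" lines. The helper names parse_meminfo and read_meminfo are hypothetical, and the sample text is made up for illustration; it is not data from the machine in this report.

```python
import re


def parse_meminfo(text):
    """Return {field: kilobytes} for every 'Name:  <n> kB' line in text."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"(\w+):\s+(\d+)\s+kB", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


# Made-up sample, for illustration only; not from the reported machine.
sample = (
    "MemTotal:  1048576 kB\n"
    "MemFree:     20480 kB\n"
    "SwapTotal: 1048576 kB\n"
    "SwapFree:   131072 kB\n"
)
info = parse_meminfo(sample)
print("MemFree kB:", info["MemFree"], "SwapFree kB:", info["SwapFree"])
```

Calling read_meminfo() every few seconds during the build and logging MemFree and SwapFree would show whether swap is really exhausted when the OOM killer fires.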

2002-01-05 00:42:55

by Nicholas Knight

Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

On Friday 04 January 2002 01:02 pm, Stephan von Krawczynski wrote:
> On Fri, 4 Jan 2002 12:32:27 -0800
>
> "Phil Oester" <[email protected]> wrote:
> > On 2.4.17, I can't make -j bzImage without OOM kicking in.
> > Relatively light .config here - bzImage compiles to less than 1mb.
> >
> > Seems with 1 gb of RAM and swap, the box should be able to handle
> > this (box is dual P3 600 btw).
> >
> > Is this unreasonable? How much RAM should it take to accomplish
> > this???
>
> You should give a bit more info on that, especially vmstat and the
> like. I cannot reproduce this. Neither on 1GB/256MB nor on 2GB/256MB
> RAM/SWAP. (P3-1GHz, dual SMP, 2.4.17)
>


I have absolutely no trouble reproducing on an 800MHz Athlon with 256MB
RAM/256MB swap on 2.4.17.

The one catch is that -j is specified without a number.

from man make:
-j jobs
Specifies the number of jobs (commands) to run
simultaneously. If there is more than one -j
option, the last one is effective.
**If the -j option is given without an argument, make will not limit
the number of jobs that can run simultaneously.**

(emphasis mine)

Hence, unlimited number of jobs, theoretically unlimited amount of
memory usage.
The last number of processes I saw in top before the system was
basically dead (and I just hit A-SYSRQ-S and A-SYSRQ-B) was 416, and all
the top processes were make or cc.

Somehow I doubt this is a kernel issue; I think it is instead a make and
user issue. A make issue because it's probably poor design for an option
that is normally harmless and useful when given a number to be
potentially lethal when the number is left off, so if you forget the
number, your system is dead. A user issue because it seems the user is
using the option without fully comprehending the consequences.

> Regards,
> Stephan
>
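
A small sketch of the alternative implied by the man page excerpt above: always pass an explicit number to -j. The function name bounded_make_command and the "two jobs per CPU" default are assumptions for illustration, not something stated in this thread. The sketch only builds the command line and does not run make.

```python
import os


def bounded_make_command(target="bzImage", jobs_per_cpu=2, cpus=None):
    """Build a make command line with an explicit -j limit.

    jobs_per_cpu=2 is an assumed rule of thumb, not a value from the thread.
    """
    if cpus is None:
        cpus = os.cpu_count() or 1  # cpu_count() may return None
    jobs = max(1, cpus * jobs_per_cpu)
    return ["make", f"-j{jobs}", target]


if __name__ == "__main__":
    # A dual-CPU box, like the one in the original report.
    print(" ".join(bounded_make_command(cpus=2)))
```

With a number attached, the process count stays bounded regardless of how many source files the build has.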

2002-01-05 01:24:38

by Phil Oester

Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

On Fri, Jan 04, 2002 at 04:42:43PM -0800, Nicholas Knight wrote:
> The one catch is that -j is specified without a number.

[snip superfluous description of what 'make -j' implies]

> number, your system is dead. A user issue because it seems the user is
> using the option without fully comprehending the consequences.

eh? Trust me - i understand the implications of make -j. It's not an unreasonable test, especially on a machine with 1gb ram/swap. For reference, read Rik's email regarding his reverse VM patch:

http://marc.theaimsgroup.com/?l=linux-kernel&m=101007711817127&w=2

Might be enlightening

-Phil

2002-01-05 12:30:57

by Luigi Genoni

Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

No trouble reproducing this here, on sparc64 with 1GB RAM/1GB swap,
on a dual Athlon with 768MB RAM/1.5GB swap, and on an Athlon with
1GB RAM/1GB swap.

But this is not a kernel issue; it is simply that
too many gcc processes are run at the same time because there are
so many source files.

On Fri, 4 Jan 2002, Nicholas Knight wrote:

> On Friday 04 January 2002 01:02 pm, Stephan von Krawczynski wrote:
> > On Fri, 4 Jan 2002 12:32:27 -0800
> >
> > "Phil Oester" <[email protected]> wrote:
> > > On 2.4.17, I can't make -j bzImage without OOM kicking in.
> > > Relatively light .config here - bzImage compiles to less than 1mb.
> > >
> > > Seems with 1 gb of RAM and swap, the box should be able to handle
> > > this (box is dual P3 600 btw).
> > >
> > > Is this unreasonable? How much RAM should it take to accomplish
> > > this???
> >
> > You should give a bit more info on that, especially vmstat and the
> > like. I cannot reproduce this. Neither on 1GB/256MB nor on 2GB/256MB
> > RAM/SWAP. (P3-1GHz, dual SMP, 2.4.17)
> >
>
>
> I have absolutely no trouble reproducing on an 800MHz Athlon with 256MB
> RAM/256MB swap on 2.4.17.
>
> The one catch is that -j is specified without a number.
>
> from man make:
> -j jobs
> Specifies the number of jobs (commands) to run
> simultaneously. If there is more than one -j
> option, the last one is effective.
> **If the -j option is given without an argument, make will not limit
> the number of jobs that can run simultaneously.**
>
> (emphasis mine)
>
> Hence, unlimited number of jobs, theoretically unlimited amount of
> memory usage.
> The last number of processes I saw in top before the system was
> basically dead (and I just hit A-SYSRQ-S and A-SYSRQ-B) was 416, and all
> the top processes were make or cc.
>
> Somehow I doubt this is a kernel issue; I think it is instead a make and
> user issue. A make issue because it's probably poor design for an option
> that is normally harmless and useful when given a number to be
> potentially lethal when the number is left off, so if you forget the
> number, your system is dead. A user issue because it seems the user is
> using the option without fully comprehending the consequences.
>
> > Regards,
> > Stephan
> >

2002-01-05 15:17:57

by Stephan von Krawczynski

Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

On Fri, 4 Jan 2002 17:24:18 -0800
Phil Oester <[email protected]> wrote:

> On Fri, Jan 04, 2002 at 04:42:43PM -0800, Nicholas Knight wrote:
> > The one catch is that -j is specified without a number.
>
> [snip superfluous description of what 'make -j' implies]
>
> > number, your system is dead. A user issue because it seems the user is
> > using the option without fully comprehending the consequences.
>
> eh? Trust me - i understand the implications of make -j. It's not an
> unreasonable test, especially on a machine with 1gb ram/swap. For reference,
> read Rik's email regarding his reverse VM patch:
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=101007711817127&w=2
>
> Might be enlightening

I guess this testcase is heading in the direction of Martin's test with
some setis running, meaning it has a lot of standard processes that need files
and try to get some work done. Can you try Martin's patch on your side, redo the
-j run and give us the result? I attached it for an easy go :-)

Thanks,
Stephan


Attachments:
vmscan.patch.2.4.17.c (1.25 kB)

2002-01-05 15:20:37

by Stephan von Krawczynski

Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

On Fri, 4 Jan 2002 16:42:43 -0800
Nicholas Knight <[email protected]> wrote:


> I have absolutely no trouble reproducing on an 800MHz Athlon with 256MB
> RAM/256MB swap on 2.4.17

The simple question is: is the RAM sufficient at all to spawn such a lot of cc
processes? In my setup I get around 1000 concurrently working during -j. This
sounds like a real problem for 256/256, or not?

Regards,
Stephan
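
The question above can be put into rough numbers. This is only back-of-the-envelope arithmetic using figures mentioned in the thread (416 processes seen in top, around 1000 concurrent processes, and the RAM/swap sizes reported). The function name per_process_budget_mb is hypothetical, and the sketch deliberately ignores kernel memory, shared pages and the fact that not every process is at its peak at the same moment.

```python
def per_process_budget_mb(ram_mb, swap_mb, processes):
    """Average memory (RAM + swap) available to each process, in MB."""
    return (ram_mb + swap_mb) / processes


# (label, RAM in MB, swap in MB, concurrent processes), all taken from the thread.
cases = [
    ("256MB/256MB, 416 processes", 256, 256, 416),
    ("256MB/256MB, 1000 processes", 256, 256, 1000),
    ("1GB/256MB, 1000 processes", 1024, 256, 1000),
    ("1GB/1GB, 1000 processes", 1024, 1024, 1000),
]

for label, ram, swap, count in cases:
    budget = per_process_budget_mb(ram, swap, count)
    print(f"{label}: {budget:.2f} MB each")
```

Whether a cc process fits in such a budget depends on the source file and compiler, which the thread does not measure.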

2002-01-05 17:57:32

by Nicholas Knight

Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

On Saturday 05 January 2002 07:19 am, Stephan von Krawczynski wrote:
> On Fri, 4 Jan 2002 16:42:43 -0800
>
> Nicholas Knight <[email protected]> wrote:
> > I have absolutely no trouble reproducing on an 800MHz Athlon with
> > 256MB RAM/256MB swap on 2.4.17
>
> The simple question is: is the RAM sufficient at all to spawn such a
> lot of cc processes? In my setup I get around 1000 concurrently
> working during -j. This sounds like a real problem for 256/256, or
> not?

Matter of scale, did you try a full kernel build with make -j bzImage
using whatever your normal config is?
I still believe this is an inappropriate test. Sure, if you have tons
of RAM and swap it may eventually complete (though if the swapfile is
very active and this is tried on a 2.4 kernel, the amount of CPU time
the compile will actually get will likely be unpredictable at best.)
This is an option that does nothing less than flood the system with
hundreds or thousands of processes that on any large compile will
simply not allow the system to survive intact without killing all the
processes. Does the OOM killer even work correctly yet? Or does it
still try to kill init at times? And for that matter, what order does
it kill in?

>
> Regards,
> Stephan

2002-01-05 21:43:52

by Eric W. Biederman

Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

Phil Oester <[email protected]> writes:

> On Fri, Jan 04, 2002 at 04:42:43PM -0800, Nicholas Knight wrote:
> > The one catch is that -j is specified without a number.
>
> [snip superfluous description of what 'make -j' implies]
>
> > number, your system is dead. A user issue because it seems the user is
> > using the option without fully comprehending the consequences.
>
> eh? Trust me - i understand the implications of make -j. It's not an
> unreasonable test, especially on a machine with 1gb ram/swap. For reference,
> read Rik's email regarding his reverse VM patch:
>
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=101007711817127&w=2
>
> Might be enlightening

Yes. It sounds like Rik slowed down fork enough that the system didn't fall
over. There may be some other policy changes as well, but my hunch is that
it is a fork speed thing. If all that happens is that the system hits
OOM when subjected to an unreasonable load, I don't see this as a problem.

The truly interesting question is what happens when you add more swap.
With sufficient swap, will it work?

Eric

2002-01-06 14:41:33

by Stephan von Krawczynski

Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

On Sat, 5 Jan 2002 09:57:17 -0800
Nicholas Knight <[email protected]> wrote:

> On Saturday 05 January 2002 07:19 am, Stephan von Krawczynski wrote:
> > On Fri, 4 Jan 2002 16:42:43 -0800
> >
> > Nicholas Knight <[email protected]> wrote:
> > > I have absolutely no trouble reproducing on an 800MHz Athlon with
> > > 256MB RAM/256MB swap on 2.4.17
> >
> > The simple question is: is the RAM sufficient at all to spawn such a
> > lot of cc processes? In my setup I get around 1000 concurrently
> > working during -j. This sounds like a real problem for 256/256, or
> > not?
>
> Matter of scale, did you try a full kernel build with make -j bzImage
> using whatever your normal config is?

Yes, of course, and it works at my side. Worked with 1GB RAM/256MB swap, works
now with 2GB RAM/256MB swap on stock 2.4.17.

> I still believe this is an inappropriate test. Sure, if you have tons
> of RAM and swap it may eventually complete

I never saw it fail to complete on my box with 2.4.17, regardless of what I do in
the meantime (writing mails or the like). Of course system performance drops
somewhat when load reaches about 150, but I think this can be expected ;-)

Regards,
Stephan

2002-01-07 06:22:43

by Phil Oester

Subject: RE: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

I've rerun this test a number of times, and cannot reliably reproduce
the OOM - though it still does OOM occasionally. It never OOM's right
after a bootup - usually the greatest chance of OOM is after 2 or 3
consecutive runs without a reboot. Once it even froze the box and
required a powercycle.

I'm surprised you cannot OOM with 1gb RAM/256MB swap, as sometimes I'm
over 900MB in swap - did you try consecutive runs, or just once and then
reboot between each run?

On a side note, there seems to be some debate as to whether this is a
valid test. The detractors primarily claim that 'make -j' just
overloads the machine with too many processes and therefore is setting
it up to fail. My position has always been that the kernel
_should_not_OOM_ under this test due to the ~2gb of RAM+swap being thrown at
it. It may die for any number of other reasons, but OOM shouldn't be
one of them. In other words, either the OOM killer may be too
aggressive here, or the kernel isn't reclaiming inactive RAM under heavy
load.

Haven't yet tried Martin's patch - though since I can't reliably produce
the OOM, testing it wouldn't help much.

-Phil Oester


-----Original Message-----
From: Stephan von Krawczynski [mailto:[email protected]]
Sent: Saturday, January 05, 2002 7:17 AM
To: Phil Oester
Cc: [email protected]; [email protected]
Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

I guess this testcase is heading in the direction of Martin's test with
some setis running, meaning it has a lot of standard processes that need files
and try to get some work done. Can you try Martin's patch on your side, redo the
-j run and give us the result? I attached it for an easy go :-)

Thanks,
Stephan


2002-01-07 14:24:50

by Stephan von Krawczynski

Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

On Sun, 6 Jan 2002 22:22:16 -0800
"Phil Oester" <[email protected]> wrote:

> I've rerun this test a number of times, and cannot reliably reproduce
> the OOM - though it still does OOM occasionally. It never OOM's right
> after a bootup - usually the greatest chance of OOM is after 2 or 3
> consecutive runs without a reboot. Once it even froze the box and
> required a powercycle.
>
> I'm surprised you cannot OOM with 1gb RAM/256MB swap, as sometimes I'm
> over 900MB in swap - did you try consecutive runs, or just once and then
> reboot between each run?

I tried just about everything I could think of and it never went into OOM. Even
the first tests I did were with several days of uptime - meaning far away from a
"cleaning" reboot. I hate reboot :-)

> [...]
> Haven't yet tried Martin's patch - though since I can't reliably produce
> the OOM, testing it wouldn't help much.

Well, take the other side: if you do not manage to OOM afterwards, even at the
tenth consecutive try, there is probably something about the patch ...

Regards,
Stephan

2002-01-08 05:13:01

by Phil Oester

Subject: RE: 1gb RAM + 1gb SWAP + make -j bzImage = OOM

The vmscan patch doesn't seem to help in the 'make -j' testcase.

Here's time of a couple runs:

2.4.17 vanilla

real 32m2.097s
user 9m51.800s
sys 3m47.700s

real 19m45.696s
user 9m55.820s
sys 2m32.170s

2.4.17 + vmscan patch

gave up waiting after 2 hours...never finished.

Unfortunately, box was not responsive enough to gather any useful
information. Perhaps not swapping enough???

-Phil
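
One way to read the two vanilla timings above: compare user + sys against the wall-clock time available on the two CPUs (the box is a dual P3 per the first message). This is a rough sketch; the helper names to_seconds and cpu_share are hypothetical, and a low share only suggests the build spent much of its time waiting. It does not show what it was waiting on.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style value such as '32m2.097s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def cpu_share(real, user, sys_time, cpus=2):
    """Fraction of the available CPU time (real * cpus) spent in user + sys."""
    busy = to_seconds(user) + to_seconds(sys_time)
    return busy / (to_seconds(real) * cpus)


# The two 2.4.17 vanilla runs reported above: (real, user, sys).
runs = [
    ("32m2.097s", "9m51.800s", "3m47.700s"),
    ("19m45.696s", "9m55.820s", "2m32.170s"),
]

for real, user, sys_time in runs:
    share = cpu_share(real, user, sys_time)
    print(f"real {real}: {share:.0%} of two CPUs busy")
```

User time is nearly identical in both runs, so the difference in wall-clock time comes from somewhere other than the compile work itself.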

-----Original Message-----
From: Stephan von Krawczynski [mailto:[email protected]]
Sent: Monday, January 07, 2002 6:24 AM
To: Phil Oester
Cc: [email protected]; [email protected]
Subject: Re: 1gb RAM + 1gb SWAP + make -j bzImage = OOM


On Sun, 6 Jan 2002 22:22:16 -0800
"Phil Oester" <[email protected]> wrote:

> I've rerun this test a number of times, and cannot reliably reproduce
> the OOM - though it still does OOM occasionally. It never OOM's right
> after a bootup - usually the greatest chance of OOM is after 2 or 3
> consecutive runs without a reboot. Once it even froze the box and
> required a powercycle.
>
> I'm surprised you cannot OOM with 1gb RAM/256MB swap, as sometimes I'm
> over 900MB in swap - did you try consecutive runs, or just once and then
> reboot between each run?

I tried just about everything I could think of and it never went into OOM. Even
the first tests I did were with several days of uptime - meaning far away from a
"cleaning" reboot. I hate reboot :-)

> [...]
> Haven't yet tried Martin's patch - though since I can't reliably produce
> the OOM, testing it wouldn't help much.

Well, take the other side: if you do not manage to OOM afterwards, even at the
tenth consecutive try, there is probably something about the patch ...

Regards,
Stephan