2002-09-25 04:22:19

by Dan Kegel

Subject: Kernel call chain search tool?

<prelude>
I have a large multithreaded program that has a habit of using too
much memory, and as a safeguard, I want to kill it before it makes
the system unstable. The OOM killer often guesses wrong, and RLIMIT_AS
kills too soon because of the address space used up by the many thread
stacks.
So I'd like an RLIMIT_RSS that just kills the fat process.

There have been a couple of patches to implement RLIMIT_RSS, e.g.
Peter Chubb's
http://marc.theaimsgroup.com/?l=linux-kernel&m=97892951101598&w=2
and the two from Rik, all of which are too complex for my needs
(and which only swap out instead of kill), so I guess I have to roll my
own.
</prelude>

Rik's patch checks RSS in handle_mm_fault(); Peter's
checks it in do_swap_page() and do_anonymous_page().
As a kernel newbie, I don't have a feel for how
those calls relate to each other. Is there a tool
somewhere that will take a set of function names and
list all the kernel call chains that start in one of
the functions and end in another?

- Dan
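
The kind of search asked about above can be sketched as a depth-first walk
over a call graph. This is only a minimal sketch, not an existing tool: the
CALL_GRAPH table below is hand-written and hypothetical (it is not extracted
from any kernel tree), and call_chains() is a name made up for illustration.
A real tool would first have to build the graph from the sources.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical, hand-written call graph for illustration only.
# It is NOT generated from kernel sources; a real tool would need to
# extract caller -> callee edges from the code first.
CALL_GRAPH: Dict[str, List[str]] = {
    "handle_mm_fault": ["handle_pte_fault"],
    "handle_pte_fault": ["do_no_page", "do_swap_page", "do_wp_page"],
    "do_no_page": ["do_anonymous_page"],
    "do_swap_page": [],
    "do_anonymous_page": [],
    "do_wp_page": [],
}


def call_chains(graph: Dict[str, List[str]],
                starts: Iterable[str],
                ends: Iterable[str]) -> Iterator[List[str]]:
    """Yield every call chain that begins in `starts` and ends in `ends`."""
    end_set = set(ends)

    def walk(func: str, path: List[str]) -> Iterator[List[str]]:
        path = path + [func]
        # Report the chain once it reaches a target (ignore zero-length chains).
        if func in end_set and len(path) > 1:
            yield path
        for callee in graph.get(func, []):
            if callee not in path:  # do not loop forever on recursive calls
                yield from walk(callee, path)

    for start in starts:
        yield from walk(start, [])


if __name__ == "__main__":
    targets = ["do_swap_page", "do_anonymous_page"]
    for chain in call_chains(CALL_GRAPH, ["handle_mm_fault"], targets):
        print(" -> ".join(chain))
```

The walk enumerates simple paths only, so a function that calls itself
(directly or indirectly) is not followed a second time within one chain.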


2002-09-26 15:06:04

by Alan

Subject: Re: Kernel call chain search tool?

On Wed, 2002-09-25 at 05:36, Dan Kegel wrote:
> <prelude>
> I have a large multithreaded program that has a habit of using too
> much memory, and as a safeguard, I want to kill it before it makes
> the system unstable. The OOM killer often guesses wrong, and RLIMIT_AS
> kills too soon because of the address space used up by the many thread
> stacks.
> So I'd like an RLIMIT_RSS that just kills the fat process.

The RSS limit isn't a "kill" limit in Unix; it's a residency limit. It
prevents the obese process from getting more than a certain amount of
RAM, as opposed to swap.

2002-09-26 16:03:03

by Dan Kegel

Subject: Re: Kernel call chain search tool?

Alan Cox wrote:
>
> On Wed, 2002-09-25 at 05:36, Dan Kegel wrote:
> > <prelude>
> > I have a large multithreaded program that has a habit of using too
> > much memory, and as a safeguard, I want to kill it before it makes
> > the system unstable. The OOM killer often guesses wrong, and RLIMIT_AS
> > kills too soon because of the address space used up by the many thread
> > stacks.
> > So I'd like an RLIMIT_RSS that just kills the fat process.
>
> The RSS limit isn't a "kill" limit in Unix; it's a residency limit. It
> prevents the obese process from getting more than a certain amount of
> RAM, as opposed to swap.

Yeah. RLIMIT_RSS seemed like something I could hijack for the
purpose, though. And the code change was really small
( http://marc.theaimsgroup.com/?l=linux-kernel&m=103299570928378 ).

If only the darn program didn't have so many threads, RLIMIT_AS
or the no-overcommit patch would be perfect. I unfortunately can't
get rid of the threads, so I'm stuck trying to figure out some way
to kill the right program when the system gets low on memory.

Maybe I should look at giving the OOM killer hints?
- Dan
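
To make the address-space versus resident-memory distinction concrete, here
is a rough user-space sketch. It is not the kernel patch linked above and
not what anyone in this thread implemented: it only parses the VmSize and
VmRSS lines in the format Linux prints in /proc/<pid>/status and checks RSS
against a threshold. The function names, the SAMPLE text, and its numbers
are all made up for illustration.

```python
import re
from typing import Dict


def parse_status_kb(status_text: str) -> Dict[str, int]:
    """Collect the 'Vm*' fields (in kB) from /proc/<pid>/status-style text."""
    fields: Dict[str, int] = {}
    pattern = r"^(Vm\w+):\s+(\d+)\s+kB$"
    for match in re.finditer(pattern, status_text, re.MULTILINE):
        fields[match.group(1)] = int(match.group(2))
    return fields


def over_rss_limit(status_text: str, rss_limit_kb: int) -> bool:
    """True if the resident set size exceeds the (hypothetical) limit."""
    rss = parse_status_kb(status_text).get("VmRSS")
    return rss is not None and rss > rss_limit_kb


# Made-up sample: a process whose many thread stacks give it a large
# address space (VmSize) while comparatively little is resident (VmRSS).
SAMPLE = """\
Name:\tfatprog
VmSize:\t 2097152 kB
VmRSS:\t   65536 kB
"""

if __name__ == "__main__":
    print(parse_status_kb(SAMPLE))
    print(over_rss_limit(SAMPLE, rss_limit_kb=32768))
```

A limit on VmSize would trip for this sample long before the resident
figure became a problem, which is the RLIMIT_AS complaint in the message
above. What to do once the check returns True is left out on purpose.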

2002-09-27 00:02:31

by Rob Landley

Subject: Re: Kernel call chain search tool?

On Thursday 26 September 2002 12:17 pm, Dan Kegel wrote:

> If only the darn program didn't have so many threads, RLIMIT_AS
> or the no-overcommit patch would be perfect. I unfortunately can't
> get rid of the threads, so I'm stuck trying to figure out some way
> to kill the right program when the system gets low on memory.
>
> Maybe I should look at giving the OOM killer hints?

The OOM killer should certainly know about threads and thread groups. If you
kill one thread, you generally have to kill the whole group, because there's
no way of knowing whether that thread was holding a futex or was otherwise the
custodian of critical data, in which case you've just thrown the program into
la-la land.

> - Dan

Rob