LinuxLists.cc - OOM killer hints (was: Re: Kernel call chain search tool?)

2002-09-27 00:24:59

Subject: OOM killer hints (was: Re: Kernel call chain search tool?)

Rob Landley wrote:
>
> On Thursday 26 September 2002 12:17 pm, Dan Kegel wrote:
>
> > If only the darn program didn't have so many threads, RLIMIT_AS
> > or the no-overcommit patch would be perfect. I unfortunately can't
> > get rid of the threads, so I'm stuck trying to figure out some way
> > to kill the right program when the system gets low on memory.
> >
> > Maybe I should look at giving the OOM killer hints?
>
> The OOM killer should certainly know about threads and thread groups. If you
> kill one thread, you generally have to kill the whole group because there's
> no way of knowing if that thread was holding a futex or otherwise custodian
> of critical data and thus you just threw the program into la-la land.

The OOM killer gets that part right; it kills all threads that share the
same mm. Where it screws up is in picking the process to kill.
This is understandable, since it's a tough problem.

Hey, how about this: I could teach the OOM killer to look at
RLIMIT_RSS. Processes which were at or nearly at their RLIMIT_RSS
would be killed first. That would be more generally useful than
my hacky little patch, and it would be even tinier. Like this, say:

--- oom_kill.c.orig Thu Sep 26 17:31:12 2002
+++ oom_kill.c Thu Sep 26 17:36:44 2002
@@ -86,6 +86,15 @@
points *= 2;

/*
+ * Processes at or near their RSS or AS limits are probably causing
+ * trouble, so double their badness points.
+ */
+ if (((3 * p->mm->rss) / 4) >= (p->rlim[RLIMIT_RSS].rlim_max >>
PAGE_SHIFT))
+ points *= 2;
+ if (((3 * p->mm->total_vm) / 4) >= (p->rlim[RLIMIT_AS].rlim_max >>
PAGE_SHIFT))
+ points *= 2;
+
+ /*
* Superuser processes are usually more important, so we make it
* less likely that we kill those.
*/

How's that look?
- Dan