2004-04-03 00:05:41

by Kurt Garloff

Subject: oom-killer adjustments

Hi Andrew, Rik,

the oom-killer in Linux does not seem to cope well with all situations.

This is no wonder: It can't know what the admin really considers
important, so it can't always make the right decision. It can only
apply some heuristics and make sure that the killing will at least
mitigate the OOM situation. It takes some other factors into account, and
of course one could argue about tweaking it some more. But what we have
nowadays is probably as good as it can get.

The attached patch does not try to tweak the heuristics but leaves them
as they are. Instead it allows the sysadmin (CAP_SYS_RESOURCE) to adjust
the outcome of the oom score calculation by a power of two in
1 << [-16 .. 15], thus allowing the sysadmin to mark the importance of
a process.
One could argue that then we should get rid of the heuristics completely.
I disagree: If a process starts to leak memory after some time, the score
still increases, which is an important property of the current score
calculation that should not be dropped in favour of static values.
Thus the adjustment factor.
The adjustment is done per process via /proc/$PID/oom_adj.

The patch does two other things:
* It also exports the oom_score to userspace in /proc/$PID/oom_score.
* It tries to kill the process with SIGTERM first, before sending
SIGKILL. If we're lucky, the process exits gracefully before the
SIGKILL is needed.

Please consider applying,
--
Kurt Garloff <[email protected]> Cologne, DE
SUSE LINUX AG, Nuernberg, DE SUSE Labs (Head)



2004-04-03 18:12:21

by Rik van Riel

Subject: Re: oom-killer adjustments

On Sat, 3 Apr 2004, Kurt Garloff wrote:

> 1 << [-16 .. 15], thus allowing the sysadmin to mark the importance of
> a process.

Shouldn't such an adjustment be inherited at fork time,
if we decide we want it in the kernel ?

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

2004-04-03 19:54:44

by Kurt Garloff

Subject: Re: oom-killer adjustments

Hi Rik,

On Sat, Apr 03, 2004 at 01:12:07PM -0500, Rik van Riel wrote:
> Shouldn't such an adjustment be inherited at fork time,
> if we decide we want it in the kernel ?

It is inherited. Why do you think it's not?

Regards,
--
Kurt Garloff <[email protected]> Cologne, DE
SUSE LINUX AG, Nuernberg, DE SUSE Labs (Head)



2004-04-03 20:21:13

by Rik van Riel

Subject: Re: oom-killer adjustments

On Sat, 3 Apr 2004, Kurt Garloff wrote:
> On Sat, Apr 03, 2004 at 01:12:07PM -0500, Rik van Riel wrote:
> > Shouldn't such an adjustment be inherited at fork time,
> > if we decide we want it in the kernel ?
>
> It is inherited. Why do you think it's not?

Oh duh, dup_task_struct() copies everything in 2.6.

ISTR 2.2 or 2.4 was slightly different, copying
(some?) things by hand...

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan