2004-03-15 13:46:02

by Cress, Andrew R

Subject: RE: [cgl_discussion] Re: About Replaceable OOM Killer

Right, once it is really OOM, you are SOL :-) At that point the only thing the kernel can do is stop allocating memory; functions that need more memory simply fail, and the recovery is to reboot.

IMO, the best answer is to detect a nearly-OOM, or trending-toward-OOM condition before it gets so bad.
This would allow userland actions, but would require more customization to tune the detection criteria, which would also imply a userland implementation of the monitoring. We've found that PCP works pretty well for this type of thing.
See http://oss.sgi.com/projects/pcp/ and http://pcp4cgl.sourceforge.net/. We did some work with this for CGL 1.0.
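
As a rough illustration of the "trending-toward-OOM" detection described above, here is a minimal sketch in Python. It is not PCP and does not use any PCP API; it only parses /proc/meminfo-style text. The function names, the 5% floor, the 60-second horizon, and the use of MemFree + Buffers + Cached as "memory that could still be handed out" are all assumptions for illustration, and they are exactly the kind of criteria that would need per-system tuning.

```python
from typing import Dict, List, Optional, Tuple


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into a {field: value_in_kB} dict."""
    fields: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_kb(info: Dict[str, int]) -> int:
    """Assumption: free memory plus buffers and page cache is usable headroom."""
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


def seconds_until_exhaustion(samples: List[Tuple[float, int]]) -> Optional[float]:
    """Linearly extrapolate (timestamp, headroom_kB) samples to zero headroom.

    Uses only the first and last sample. Returns None when there are too few
    samples or the headroom is not shrinking.
    """
    if len(samples) < 2:
        return None
    (t0, kb0), (t1, kb1) = samples[0], samples[-1]
    if t1 <= t0 or kb1 >= kb0:
        return None
    rate = (kb0 - kb1) / (t1 - t0)  # kB consumed per second
    return kb1 / rate


def near_oom(info: Dict[str, int],
             samples: List[Tuple[float, int]],
             min_fraction: float = 0.05,    # hypothetical threshold
             min_seconds: float = 60.0) -> bool:  # hypothetical horizon
    """True if headroom is already low or is projected to run out soon."""
    total = info.get("MemTotal", 0)
    if total <= 0:
        return False
    low = reclaimable_kb(info) / total < min_fraction
    eta = seconds_until_exhaustion(samples)
    trending = eta is not None and eta < min_seconds
    return low or trending
```

A monitor would read /proc/meminfo every few seconds, append (time, reclaimable_kb) to a short window of samples, and trigger its userland action when near_oom() returns True.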

Andy Cress

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Pavel Machek
Sent: Monday, March 08, 2004 6:02 AM
To: Yury V. Umanets
Cc: Guo, Min; Tvrtko A. Uršulin; [email protected]; [email protected]
Subject: [cgl_discussion] Re: About Replaceable OOM Killer


Hi!

> > Though it hasn't been updated for a while because nobody cares...
> IMHO the problem with the OOM killer is that it will always make the wrong
> choice. So it should either be plugin-based or configurable, which means
> it will become more complex and buggy. Doesn't that mean the OOM killer
> should be moved to user space?
>
> How about exporting the OOM event to user space? It could be done in the
> same manner as hotplug scripts are invoked.

When you are OOM, you really can't exec a userland script...

--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms



2004-03-15 18:24:26

by Pavel Machek

Subject: Re: [cgl_discussion] Re: About Replaceable OOM Killer

Hi!

> Right, once it is really OOM, you are SOL :-) At that point the only thing the kernel can do is stop allocating memory; functions that need more memory simply fail, and the recovery is to reboot.
>
> IMO, the best answer is to detect a nearly-OOM, or trending-toward-OOM condition before it gets so bad.
> This would allow userland actions, but would require more customization to tune the detection criteria, which would also imply a userland implementation of the monitoring. We've found that PCP works pretty well for this type of thing.
> See http://oss.sgi.com/projects/pcp/ and http://pcp4cgl.sourceforge.net/. We did some work with this for CGL 1.0.
>

Well, I see that as orthogonal.

With the right daemon you can prevent OOM in most situations. The kernel still
needs some OOM killer for cases where the failure was just too fast, but
it can now be simpler (and that's good).
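
To make the division of labour concrete, here is a minimal sketch in Python of the userland half: a daemon that sheds load before the kernel ever has to step in, with the victim-selection policy passed in as a function so it is "replaceable". Everything here is hypothetical and for illustration only: the Proc record, the largest_rss policy, the protected-name set, and the choice of SIGTERM are assumptions, not anything proposed in this thread or present in the kernel.

```python
import signal
from typing import Callable, Iterable, List, NamedTuple, Optional, Set


class Proc(NamedTuple):
    """Hypothetical snapshot of one process as seen by the daemon."""
    pid: int
    name: str
    rss_kb: int


# A policy maps the candidate processes to one victim (or None).
Policy = Callable[[Iterable[Proc]], Optional[Proc]]


def largest_rss(procs: Iterable[Proc]) -> Optional[Proc]:
    """Example policy: pick the process with the largest resident set."""
    return max(procs, key=lambda p: p.rss_kb, default=None)


def choose_victim(procs: Iterable[Proc],
                  protected: Set[str],
                  policy: Policy = largest_rss) -> Optional[Proc]:
    """Filter out init (pid 1) and protected names, then apply the policy."""
    candidates: List[Proc] = [
        p for p in procs if p.pid != 1 and p.name not in protected
    ]
    return policy(candidates)


def relieve_pressure(procs: Iterable[Proc],
                     protected: Set[str],
                     send_signal: Callable[[int, int], None],
                     policy: Policy = largest_rss) -> Optional[Proc]:
    """Ask the chosen victim to exit; return it, or None if nobody qualifies.

    send_signal is injected so the policy can be exercised without killing
    anything; a real daemon would pass os.kill here.
    """
    victim = choose_victim(procs, protected, policy)
    if victim is not None:
        send_signal(victim.pid, signal.SIGTERM)
    return victim
```

Because this runs before memory is actually exhausted, it can afford to be as elaborate as needed, while the in-kernel killer only has to cover the cases where the failure was too fast for the daemon to react.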
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-15 20:07:47

by Marcelo Tosatti

Subject: RE: [cgl_discussion] Re: About Replaceable OOM Killer


Yury, others,

I do think a "replaceable OOM killer" is a valid and useful thing.

You should redirect your efforts toward getting such a feature accepted
in 2.6, though.

On Mon, 15 Mar 2004, Cress, Andrew R wrote:

> Right, once it is really OOM, you are SOL :-) At that point the only thing the kernel can do is stop allocating memory; functions that need more memory simply fail, and the recovery is to reboot.
>
> IMO, the best answer is to detect a nearly-OOM, or trending-toward-OOM condition before it gets so bad.
> This would allow userland actions, but would require more customization to tune the detection criteria, which would also imply a userland implementation of the monitoring. We've found that PCP works pretty well for this type of thing.
> See http://oss.sgi.com/projects/pcp/ and http://pcp4cgl.sourceforge.net/. We did some work with this for CGL 1.0.