2015-11-04 20:32:17

by David Rientjes

Subject: [patch] mm, oom: add comment for why oom_adj exists

/proc/pid/oom_adj exists solely to avoid breaking existing userspace
binaries that write to the tunable.

Add a comment in the only possible location within the kernel tree to
describe the situation and motivation for keeping it around.

Signed-off-by: David Rientjes <[email protected]>
---
fs/proc/base.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1032,6 +1032,16 @@ static ssize_t oom_adj_read(struct file *file, char __user *buf, size_t count,
return simple_read_from_buffer(buf, count, ppos, buffer, len);
}

+/*
+ * /proc/pid/oom_adj exists solely for backwards compatibility with previous
+ * kernels. The effective policy is defined by oom_score_adj, which has a
+ * different scale: oom_adj grew exponentially and oom_score_adj grows linearly.
+ * Values written to oom_adj are simply mapped linearly to oom_score_adj.
+ * Processes that become oom disabled via oom_adj will still be oom disabled
+ * with this implementation.
+ *
+ * oom_adj cannot be removed since existing userspace binaries use it.
+ */
static ssize_t oom_adj_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{


2015-11-05 10:28:45

by Michal Hocko

Subject: Re: [patch] mm, oom: add comment for why oom_adj exists

On Wed 04-11-15 12:32:14, David Rientjes wrote:
> /proc/pid/oom_adj exists solely to avoid breaking existing userspace
> binaries that write to the tunable.
>
> Add a comment in the only possible location within the kernel tree to
> describe the situation and motivation for keeping it around.

I am not sure this is really needed but it certainly is not harmful.
If this is a way to suppress any attempts for changes like
http://lkml.kernel.org/r/1f80189385e540c2a5b2747a7a265d8c%40SHMBX01.spreadtrum.com
then it does not explain why those are not desirable.

> Signed-off-by: David Rientjes <[email protected]>
> ---
> fs/proc/base.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1032,6 +1032,16 @@ static ssize_t oom_adj_read(struct file *file, char __user *buf, size_t count,
> return simple_read_from_buffer(buf, count, ppos, buffer, len);
> }
>
> +/*
> + * /proc/pid/oom_adj exists solely for backwards compatibility with previous
> + * kernels. The effective policy is defined by oom_score_adj, which has a
> + * different scale: oom_adj grew exponentially and oom_score_adj grows linearly.
> + * Values written to oom_adj are simply mapped linearly to oom_score_adj.
> + * Processes that become oom disabled via oom_adj will still be oom disabled
> + * with this implementation.
> + *
> + * oom_adj cannot be removed since existing userspace binaries use it.

This is a bit strong wording. I think the knob can be removed in the future.

* oom_adj is kept for compatibility reasons. There are still a few
* projects which use oom_adj only. We have tried to convert all of them
* which could be found but it will take some time until all those changes
* bubble up to all users. We might try to remove the knob in a few years
* if the situation changes.

> + */
> static ssize_t oom_adj_write(struct file *file, const char __user *buf,
> size_t count, loff_t *ppos)
> {

--
Michal Hocko
SUSE Labs

2015-11-05 21:28:28

by David Rientjes

Subject: Re: [patch] mm, oom: add comment for why oom_adj exists

On Thu, 5 Nov 2015, Michal Hocko wrote:

> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -1032,6 +1032,16 @@ static ssize_t oom_adj_read(struct file *file, char __user *buf, size_t count,
> > return simple_read_from_buffer(buf, count, ppos, buffer, len);
> > }
> >
> > +/*
> > + * /proc/pid/oom_adj exists solely for backwards compatibility with previous
> > + * kernels. The effective policy is defined by oom_score_adj, which has a
> > + * different scale: oom_adj grew exponentially and oom_score_adj grows linearly.
> > + * Values written to oom_adj are simply mapped linearly to oom_score_adj.
> > + * Processes that become oom disabled via oom_adj will still be oom disabled
> > + * with this implementation.
> > + *
> > + * oom_adj cannot be removed since existing userspace binaries use it.
>
> This is a bit strong wording. I think the knob can be removed in the future.
>

Perhaps you are more optimistic than I am, but I would think it would be
difficult to remove a tunable when avoiding it requires binaries to be
rebuilt. That was Linus's primary objection, IIRC. If an application fails
to oom disable itself because it still writes to oom_adj, the result
could be a system-wide failure. There are workarounds to that if you have
root, but I don't think we're in a position to remove it in the near
future. I think the comment makes clear why it cannot be removed right now
and how it is currently implemented.

Converting software that writes to oom_adj to use oom_score_adj instead is
still a worthwhile goal, though, since they'd be using the semantics of
the effective policy.
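
As an illustration of the linear mapping the proposed comment describes, here
is a minimal userspace sketch in C. It is not the kernel code. The constants
are assumed to match include/uapi/linux/oom.h (OOM_DISABLE = -17,
OOM_ADJUST_MAX = 15, OOM_SCORE_ADJ_MIN/MAX = -1000/1000), and both
oom_adj_to_score_adj() and set_own_oom_score_adj() are hypothetical helper
names made up for this example. The special case for the maximum value is an
assumption about how oom_adj_write() avoids rounding 15 down to 882.

```c
#include <stdio.h>

/* Assumed values, believed to match include/uapi/linux/oom.h. */
#define OOM_DISABLE        (-17)
#define OOM_ADJUST_MAX     15
#define OOM_SCORE_ADJ_MIN  (-1000)
#define OOM_SCORE_ADJ_MAX  1000

/*
 * Hypothetical helper: map a legacy oom_adj value onto the oom_score_adj
 * scale. Returns 0 on success, -1 if oom_adj is out of range.
 */
int oom_adj_to_score_adj(int oom_adj, int *score_adj)
{
	if (oom_adj < OOM_DISABLE || oom_adj > OOM_ADJUST_MAX)
		return -1;

	if (oom_adj == OOM_ADJUST_MAX)
		/* Assumed special case: the maximum maps to the maximum. */
		*score_adj = OOM_SCORE_ADJ_MAX;
	else
		/* Linear scaling; OOM_DISABLE (-17) lands on -1000. */
		*score_adj = (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;

	return 0;
}

/*
 * Hypothetical helper: write a value to the calling process's own
 * oom_score_adj instead of the legacy oom_adj file.
 * Returns 0 on success, -1 on failure.
 */
int set_own_oom_score_adj(int score_adj)
{
	FILE *f;
	int ok;

	if (score_adj < OOM_SCORE_ADJ_MIN || score_adj > OOM_SCORE_ADJ_MAX)
		return -1;

	f = fopen("/proc/self/oom_score_adj", "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", score_adj) > 0;
	/* The write reaches procfs when the stream is flushed at close. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Under these assumptions, an application that used to write -17 to oom_adj
would write -1000 to oom_score_adj, 0 stays 0, and 15 becomes 1000. Lowering
the value may need extra privileges, so callers should check the return
value of the write.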

2015-11-06 03:52:25

by Hillf Danton

Subject: Re: [patch] mm, oom: add comment for why oom_adj exists

>
> /proc/pid/oom_adj exists solely to avoid breaking existing userspace
> binaries that write to the tunable.
>
> Add a comment in the only possible location within the kernel tree to
> describe the situation and motivation for keeping it around.
>
> Signed-off-by: David Rientjes <[email protected]>
> ---

Acked-by: Hillf Danton <[email protected]>

> fs/proc/base.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1032,6 +1032,16 @@ static ssize_t oom_adj_read(struct file *file, char __user *buf, size_t count,
> return simple_read_from_buffer(buf, count, ppos, buffer, len);
> }
>
> +/*
> + * /proc/pid/oom_adj exists solely for backwards compatibility with previous
> + * kernels. The effective policy is defined by oom_score_adj, which has a
> + * different scale: oom_adj grew exponentially and oom_score_adj grows linearly.
> + * Values written to oom_adj are simply mapped linearly to oom_score_adj.
> + * Processes that become oom disabled via oom_adj will still be oom disabled
> + * with this implementation.
> + *
> + * oom_adj cannot be removed since existing userspace binaries use it.
> + */
> static ssize_t oom_adj_write(struct file *file, const char __user *buf,
> size_t count, loff_t *ppos)
> {
> --