2010-11-01 07:06:49

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

> On Tue, 26 Oct 2010, KOSAKI Motohiro wrote:
>
> > > NACK as a logical follow-up to my NACK for "oom: remove totalpage
> > > normalization from oom_badness()"
> >
> > Huh?
> >
> > I requested you show us justification. BUT YOU DIDNT. If you have any
> > usecase, show us RIGHT NOW.
> >
>
> The new tunable added in 2.6.36, /proc/pid/oom_score_adj, is necessary for
> the units that the badness score now uses. We need a tunable with a much

Who we?

> higher resolution than the oom_adj scale from -16 to +15, and one that
> scales linearly as opposed to exponentially. Since that tunable is much
> more powerful than the oom_adj implementation, which never made any real

The reason you were NAKed was not that you introduced a powerful new feature.
It was that you broke an old feature that applications use.

> sense for defining oom killing priority for any purpose other than
> polarization, the old tunable is deprecated for two years.

You haven't tested your patch at all. Distros' initram scripts use the
oom_adj interface, and the latest kernel shows a pointless warning,
"/proc/xx/oom_adj is deprecated, please use /proc/xx/oom_score_adj instead."
at _every_ boot.

As I said, DON'T SEND UNTESTED PATCHES! DON'T BREAK USERLAND CARELESSLY!

2010-11-01 19:36:29

by David Rientjes

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

On Mon, 1 Nov 2010, KOSAKI Motohiro wrote:

> > The new tunable added in 2.6.36, /proc/pid/oom_score_adj, is necessary for
> > the units that the badness score now uses. We need a tunable with a much
>
> Who we?
>

Linux users who care about prioritizing tasks for oom kill with a tunable
that (1) has a unit, (2) has a higher resolution, and (3) is linear and
not exponential. Memcg doesn't solve this issue without incurring a 1%
memory cost.

> > higher resolution than the oom_adj scale from -16 to +15, and one that
> > scales linearly as opposed to exponentially. Since that tunable is much
> > more powerful than the oom_adj implementation, which never made any real
>
> The reason that you ware NAKed was not to introduce new powerful feature.
> It was caused to break old and used feature from applications.
>

No, it doesn't, and you completely and utterly failed to show a single
usecase that broke as a result of this because nobody can currently use
oom_adj for anything other than polarization. Thus, there's no backwards
compatibility issue.

> > sense for defining oom killing priority for any purpose other than
> > polarization, the old tunable is deprecated for two years.
>
> You haven't tested your patch at all. Distro's initram script are using
> oom_adj interface and latest kernel show pointless warnings
> "/proc/xx/oom_adj is deprecated, please use /proc/xx/oom_score_adj instead."
> at _every_ boot time.
>

Yes, I've tested it, and it deprecates the tunable as expected. A single
warning message serves the purpose well: let users know one time without
being overly verbose that the tunable is deprecated and give them
sufficient time (2 years) to start using the new tunable. That's how
deprecation is done.

2010-11-09 02:26:33

by KOSAKI Motohiro

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

> On Mon, 1 Nov 2010, KOSAKI Motohiro wrote:
>
> > > The new tunable added in 2.6.36, /proc/pid/oom_score_adj, is necessary for
> > > the units that the badness score now uses. We need a tunable with a much
> >
> > Who we?
> >
>
> Linux users who care about prioritizing tasks for oom kill with a tunable
> that (1) has a unit, (2) has a higher resolution, and (3) is linear and
> not exponential.

No. The majority of users don't care. You are only talking about your own
case. Don't ignore end users.

> Memcg doesn't solve this issue without incurring a 1%
> memory cost.

Look at reality.
All major distributions have already turned on memcg. End users don't need
to pay any additional cost.

>
> > > higher resolution than the oom_adj scale from -16 to +15, and one that
> > > scales linearly as opposed to exponentially. Since that tunable is much
> > > more powerful than the oom_adj implementation, which never made any real
> >
> > The reason that you ware NAKed was not to introduce new powerful feature.
> > It was caused to break old and used feature from applications.
> >
>
> No, it doesn't, and you completely and utterly failed to show a single
> usecase that broke as a result of this because nobody can currently use
> oom_adj for anything other than polarization. Thus, there's no backwards
> compatibility issue.

No. I did show it.
1) Google Code Search shows that some applications use this feature.
http://www.google.com/codesearch?as_q=oom_adj&btnG=Search+Code&hl=ja&as_package=&as_lang=&as_filename=&as_class=&as_function=&as_license=&as_case=

2) It is not true that nobody uses oom_adj for anything other than
polarization, even if there are only a few. For example, KDE uses it.
http://www.google.com/codesearch/p?hl=ja#MPJuLvSvNYM/pub/kde/unstable/snapshots/kdelibs.tar.bz2%7CWClmGVN5niU/kdelibs-1164923/kinit/start_kdeinit.c&q=oom_adj%20kde%205

When you talk about the polarization issue, you are blind to reality. Don't talk about your dream.

3) udev uses this feature. It's one of the major Linux components, and you broke it.

http://www.google.com/codesearch/p?hl=ja#KVTjzuVpblQ/pub/linux/utils/kernel/hotplug/udev-072.tar.bz2%7CwUSE-Ay3lLI/udev-072/udevd.c&q=oom_adj

You must not break our userland. You can't rewrite or deprecate the
old one. It's used! You can only add an orthogonal new knob.

> > > sense for defining oom killing priority for any purpose other than
> > > polarization, the old tunable is deprecated for two years.
> >
> > You haven't tested your patch at all. Distro's initram script are using
> > oom_adj interface and latest kernel show pointless warnings
> > "/proc/xx/oom_adj is deprecated, please use /proc/xx/oom_score_adj instead."
> > at _every_ boot time.
> >
>
> Yes, I've tested it, and it deprecates the tunable as expected. A single
> warning message serves the purpose well: let users know one time without
> being overly verbose that the tunable is deprecated and give them
> sufficient time (2 years) to start using the new tunable. That's how
> deprecation is done.

That makes no sense.

Why do they need to rewrite their applications for *YOU*? Okay, you will
benefit from your new knob. But NOBODY uses the new one, and people need
to rewrite their applications even though they get no benefit.

Don't do selfish userland breakage!

2010-11-09 03:28:14

by KOSAKI Motohiro

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

> > Yes, I've tested it, and it deprecates the tunable as expected. A single
> > warning message serves the purpose well: let users know one time without
> > being overly verbose that the tunable is deprecated and give them
> > sufficient time (2 years) to start using the new tunable. That's how
> > deprecation is done.
>
> no sense.
>
> Why do their application need to rewrite for *YOU*? Okey, you will got
> benefit from your new knob. But NOBDOY use the new one. and People need
> to rewrite their application even though no benefit.
>
> Don't do selfish userland breakage!

And you said you ignore a bug even though you have seen it. That sucks!

2010-11-09 23:33:18

by David Rientjes

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

On Tue, 9 Nov 2010, KOSAKI Motohiro wrote:

> > > > The new tunable added in 2.6.36, /proc/pid/oom_score_adj, is necessary for
> > > > the units that the badness score now uses. We need a tunable with a much
> > >
> > > Who we?
> > >
> >
> > Linux users who care about prioritizing tasks for oom kill with a tunable
> > that (1) has a unit, (2) has a higher resolution, and (3) is linear and
> > not exponential.
>
> No. Majority user don't care. You only talk about your case. Don't ignore
> end user.
>

If they don't care, then they won't be using oom_adj, so your point
about its deprecation is irrelevant.

Other users do want a more powerful userspace interface with a unit and
higher resolution (I am one of them), there's no requirement that those
users need to be in the majority.

> > Memcg doesn't solve this issue without incurring a 1%
> > memory cost.
>
> Look at a real.
> All major distributions has already turn on memcg. End user don't need
> to pay additional cost.
>

Memcg also has a command-line disabling option to avoid incurring this 1%
memory cost when you're not going to be using it.

> > No, it doesn't, and you completely and utterly failed to show a single
> > usecase that broke as a result of this because nobody can currently use
> > oom_adj for anything other than polarization. Thus, there's no backwards
> > compatibility issue.
>
> No. I showed.
> 1) Google code search showed some application are using this feature.
> http://www.google.com/codesearch?as_q=oom_adj&btnG=Search+Code&hl=ja&as_package=&as_lang=&as_filename=&as_class=&as_function=&as_license=&as_case=
>

oom_adj isn't removed, it's deprecated. These users are using a
deprecated interface and have a few years to convert to using the new
interface (if it ever is actually removed).

> 2) Not body use oom_adj other than polarization even though there are a few.
> example, kde are using.
> http://www.google.com/codesearch/p?hl=ja#MPJuLvSvNYM/pub/kde/unstable/snapshots/kdelibs.tar.bz2%7CWClmGVN5niU/kdelibs-1164923/kinit/start_kdeinit.c&q=oom_adj%20kde%205
>
> When you are talking polarization issue, you blind a real. Don't talk your dream.
>

I don't understand what you're trying to say here, but the current users
of oom_adj that aren't +15 or -16 (or OOM_DISABLE) use arbitrary values
relative to other tasks, such as +5, +10, etc. They don't have any
semantics other than being arbitrarily relative because it doesn't work in
a linear way or with a scale.

> 3) udev are using this feature. It's one of major linux component and you broke.
>
> http://www.google.com/codesearch/p?hl=ja#KVTjzuVpblQ/pub/linux/utils/kernel/hotplug/udev-072.tar.bz2%7CwUSE-Ay3lLI/udev-072/udevd.c&q=oom_adj
>
> You don't have to break our userland. you can't rewrite or deprecate
> old one. It's used! You can only add orthogonal new knob.
>

That's incorrect, I didn't break anything by deprecating a tunable for a
few years. oom_adj gets converted roughly into an equivalent (but linear)
oom_score_adj.

Unfortunately for your argument, you can't show a single example of a
current oom_adj user that has a scientific calculation behind its value
that is now broken on the linear scale.

> > Yes, I've tested it, and it deprecates the tunable as expected. A single
> > warning message serves the purpose well: let users know one time without
> > being overly verbose that the tunable is deprecated and give them
> > sufficient time (2 years) to start using the new tunable. That's how
> > deprecation is done.
>
> no sense.
>
> Why do their application need to rewrite for *YOU*? Okey, you will got
> benefit from your new knob. But NOBDOY use the new one. and People need
> to rewrite their application even though no benefit.
>
> Don't do selfish userland breakage!
>

It's deprecated for a few years so users can gradually convert to the new
tunable, it wasn't removed when the new one was introduced. A higher
resolution tunable that scales linearly with a unit is an advantage for
Linux (for the minority of users who care about oom killing priority
beyond the heuristic) and I think a few years is enough time for users to
do a simple conversion to the new tunable.
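
The rough conversion of an oom_adj value into a linear oom_score_adj
mentioned above can be sketched as follows. This is a minimal illustration,
not kernel code: the constants (-17 for OOM_DISABLE, +15 for the maximum
oom_adj, 1000 for the maximum oom_score_adj) and the special case for the
maximum are assumptions based on the 2.6.36 implementation as commonly
described, and the function name is hypothetical.

```python
# Assumed constants, mirroring the values discussed in this thread.
OOM_DISABLE = -17          # oom_adj value that disables oom killing
OOM_ADJUST_MAX = 15        # largest oom_adj value
OOM_SCORE_ADJ_MAX = 1000   # largest oom_score_adj value


def oom_adj_to_score_adj(oom_adj: int) -> int:
    """Map an oom_adj in [-17, 15] onto the linear [-1000, 1000] scale."""
    if not OOM_DISABLE <= oom_adj <= OOM_ADJUST_MAX:
        raise ValueError("oom_adj must be in [-17, 15]")
    if oom_adj == OOM_ADJUST_MAX:
        # The top of the old scale maps to the top of the new one.
        return OOM_SCORE_ADJ_MAX
    # Scale by 1000/17, truncating toward zero like C integer division.
    magnitude = abs(oom_adj) * OOM_SCORE_ADJ_MAX // -OOM_DISABLE
    return magnitude if oom_adj >= 0 else -magnitude
```

Under these assumptions, OOM_DISABLE (-17) becomes -1000, 0 stays 0, and an
arbitrary relative value such as +5 becomes 294 on the linear scale.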

2010-11-09 23:37:30

by Alan

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

> It's deprecated for a few years so users can gradually convert to the new
> tunable, it wasn't removed when the new one was introduced. A higher
> resolution tunable that scales linearly with a unit is an advantage for
> Linux (for the minority of users who care about oom killing priority
> beyond the heuristic) and I think a few years is enough time for users to
> do a simple conversion to the new tunable.

Documentation/ABI/obsolete/

should have all obsoletes in it.

Alan

2010-11-09 23:48:25

by David Rientjes

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

On Tue, 9 Nov 2010, Alan Cox wrote:

> > It's deprecated for a few years so users can gradually convert to the new
> > tunable, it wasn't removed when the new one was introduced. A higher
> > resolution tunable that scales linearly with a unit is an advantage for
> > Linux (for the minority of users who care about oom killing priority
> > beyond the heuristic) and I think a few years is enough time for users to
> > do a simple conversion to the new tunable.
>
> Documentation/ABI/obsolete/
>
> should have all obsoletes in it.
>

Good point, the only documentation right now is in
Documentation/feature-removal-schedule.txt and in the kernel log the first
time oom_adj is written. I'll generate a patch, thanks!

2010-11-09 23:56:05

by David Rientjes

Subject: [patch] oom: document obsolete oom_adj tunable

/proc/pid/oom_adj was deprecated in August 2010 with the introduction of
the new oom killer heuristic.

This patch copies the Documentation/feature-removal-schedule.txt entry
for this tunable to the Documentation/ABI/obsolete directory so nobody
misses it.

Reported-by: Alan Cox <[email protected]>
Signed-off-by: David Rientjes <[email protected]>
---
Documentation/ABI/obsolete/proc-pid-oom_adj | 22 ++++++++++++++++++++++
1 files changed, 22 insertions(+), 0 deletions(-)
create mode 100644 Documentation/ABI/obsolete/proc-pid-oom_adj

diff --git a/Documentation/ABI/obsolete/proc-pid-oom_adj b/Documentation/ABI/obsolete/proc-pid-oom_adj
new file mode 100644
--- /dev/null
+++ b/Documentation/ABI/obsolete/proc-pid-oom_adj
@@ -0,0 +1,22 @@
+What: /proc/<pid>/oom_adj
+When: August 2012
+Why: /proc/<pid>/oom_adj allows userspace to influence the oom killer's
+ badness heuristic used to determine which task to kill when the kernel
+ is out of memory.
+
+ The badness heuristic has since been rewritten since the introduction of
+ this tunable such that its meaning is deprecated. The value was
+ implemented as a bitshift on a score generated by the badness()
+ function that did not have any precise units of measure. With the
+ rewrite, the score is given as a proportion of available memory to the
+ task allocating pages, so using a bitshift which grows the score
+ exponentially is, thus, impossible to tune with fine granularity.
+
+ A much more powerful interface, /proc/<pid>/oom_score_adj, was
+ introduced with the oom killer rewrite that allows users to increase or
+ decrease the badness() score linearly. This interface will replace
+ /proc/<pid>/oom_adj.
+
+ A warning will be emitted to the kernel log if an application uses this
+ deprecated interface. After it is printed once, future warnings will be
+ suppressed until the kernel is rebooted.
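
For an application that wants to follow this deprecation notice, a
conversion could look roughly like the sketch below: prefer
/proc/<pid>/oom_score_adj and fall back to /proc/<pid>/oom_adj only when the
new file does not exist (older kernels). The helper names, the proc_root
parameter (there so the code can be exercised against a scratch directory)
and the down-scaling used for the fallback are assumptions made for
illustration, not something taken from the patch.

```python
import os


def score_adj_to_oom_adj(score_adj: int) -> int:
    """Scale a [-1000, 1000] value down to the old [-17, 15] range."""
    magnitude = abs(score_adj) * 17 // 1000
    # Positive values are capped at +15, the largest oom_adj.
    return min(magnitude, 15) if score_adj >= 0 else -magnitude


def set_oom_priority(pid: int, score_adj: int, proc_root: str = "/proc") -> str:
    """Write the adjustment for pid and return the path that was written."""
    if not -1000 <= score_adj <= 1000:
        raise ValueError("score_adj must be in [-1000, 1000]")
    task_dir = os.path.join(proc_root, str(pid))
    new_path = os.path.join(task_dir, "oom_score_adj")
    if os.path.exists(new_path):
        path, value = new_path, score_adj
    else:
        # Older kernel: only the deprecated tunable is available.
        path = os.path.join(task_dir, "oom_adj")
        value = score_adj_to_oom_adj(score_adj)
    with open(path, "w") as f:
        f.write(str(value))
    return path
```

On a kernel that provides the new file, set_oom_priority(os.getpid(), 500)
would never touch oom_adj, so the one-time warning described above would not
be triggered by that application.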

2010-11-14 05:07:32

by KOSAKI Motohiro

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

> On Tue, 9 Nov 2010, KOSAKI Motohiro wrote:
>
> > > > > The new tunable added in 2.6.36, /proc/pid/oom_score_adj, is necessary for
> > > > > the units that the badness score now uses. We need a tunable with a much
> > > >
> > > > Who we?
> > > >
> > >
> > > Linux users who care about prioritizing tasks for oom kill with a tunable
> > > that (1) has a unit, (2) has a higher resolution, and (3) is linear and
> > > not exponential.
> >
> > No. Majority user don't care. You only talk about your case. Don't ignore
> > end user.
>
> If they don't care, then they won't be using oom_adj, so you're point
> about it's deprecation is irrelevant.

Not irrelevant. Your patch breaks their environment even though
they don't use oom_adj explicitly, because their applications use it.

>
> Other users do want a more powerful userspace interface with a unit and
> higher resolution (I am one of them), there's no requirement that those
> users need to be in the majority.

But they only live in your DREAM. You couldn't show who needs it.

>
> > > Memcg doesn't solve this issue without incurring a 1%
> > > memory cost.
> >
> > Look at a real.
> > All major distributions has already turn on memcg. End user don't need
> > to pay additional cost.
>
> Memcg also has a command-line disabling option to avoid incurring this 1%
> memory cost when you're not going to be using it.

Look at reality. Who uses it?

> > > No, it doesn't, and you completely and utterly failed to show a single
> > > usecase that broke as a result of this because nobody can currently use
> > > oom_adj for anything other than polarization. Thus, there's no backwards
> > > compatibility issue.
> >
> > No. I showed.
> > 1) Google code search showed some application are using this feature.
> > http://www.google.com/codesearch?as_q=oom_adj&btnG=Search+Code&hl=ja&as_package=&as_lang=&as_filename=&as_class=&as_function=&as_license=&as_case=
> >
>
> oom_adj isn't removed, it's deprecated. These users are using a
> deprecated interface and have a few years to convert to using the new
> interface (if it ever is actually removed).

No. There is no reason to force tons of applications to be rewritten.

>
> > 2) Not body use oom_adj other than polarization even though there are a few.
> > example, kde are using.
> > http://www.google.com/codesearch/p?hl=ja#MPJuLvSvNYM/pub/kde/unstable/snapshots/kdelibs.tar.bz2%7CWClmGVN5niU/kdelibs-1164923/kinit/start_kdeinit.c&q=oom_adj%20kde%205
> >
> > When you are talking polarization issue, you blind a real. Don't talk your dream.
> >
>
> I don't understand what you're trying to say here, but the current users
> of oom_adj that aren't +15 or -16 (or OOM_DISABLE) are arbitrary based
> relative to other tasks such as +5, +10, etc. They don't have any
> semantics other than being arbitrarily relative because it doesn't work in
> a linear way or with a scale.

Even if you don't understand, they are IN THE WORLD. You must not
ignore reality.

> > 3) udev are using this feature. It's one of major linux component and you broke.
> >
> > http://www.google.com/codesearch/p?hl=ja#KVTjzuVpblQ/pub/linux/utils/kernel/hotplug/udev-072.tar.bz2%7CwUSE-Ay3lLI/udev-072/udevd.c&q=oom_adj
> >
> > You don't have to break our userland. you can't rewrite or deprecate
> > old one. It's used! You can only add orthogonal new knob.
> >
>
> That's incorrect, I didn't break anything by deprecating a tunable for a
> few years. oom_adj gets converted roughly into an equivalent (but linear)
> oom_score_adj.
>
> Unfortunately for your argument, you can't show a single example of a
> current oom_adj user that has a scientific calculation behind its value
> that is now broken on the linear scale.

You are talking about an unrelated thing.

>
> > > Yes, I've tested it, and it deprecates the tunable as expected. A single
> > > warning message serves the purpose well: let users know one time without
> > > being overly verbose that the tunable is deprecated and give them
> > > sufficient time (2 years) to start using the new tunable. That's how
> > > deprecation is done.
> >
> > no sense.
> >
> > Why do their application need to rewrite for *YOU*? Okey, you will got
> > benefit from your new knob. But NOBDOY use the new one. and People need
> > to rewrite their application even though no benefit.
> >
> > Don't do selfish userland breakage!
> >
>
> It's deprecated for a few years so users can gradually convert to the new
> tunable, it wasn't removed when the new one was introduced. A higher
> resolution tunable that scales linearly with a unit is an advantage for
> Linux (for the minority of users who care about oom killing priority
> beyond the heuristic) and I think a few years is enough time for users to
> do a simple conversion to the new tunable.

no sense.

2010-11-14 21:39:35

by David Rientjes

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

On Sun, 14 Nov 2010, KOSAKI Motohiro wrote:

> No irrelevant. Your patch break their environment even though
> they don't use oom_adj explicitly. because their application are using it.
>

The _only_ difference to oom_adj since the rewrite is that it is now
mapped on a linear scale rather than an exponential scale. That's because
the heuristic itself has a defined range [0, 1000] that characterizes the
memory usage of the application it is ranking. To show any breakage, you
would have to show how oom_adj values being used by applications are based
on a calculated value that prioritizes those tasks amongst each other.
With the exponential scale, that's nearly impossible because of the number
of arbitrary heuristics that were used before oom_adj was considered
(runtime, nice level, CAP_SYS_RAWIO, etc).

So don't talk about userspace breakage when you can't even describe it or
present a single usecase.

2010-11-15 00:22:32

by KOSAKI Motohiro

Subject: Re: [patch] oom: document obsolete oom_adj tunable

> /proc/pid/oom_adj was deprecated in August 2010 with the introduction of
> the new oom killer heuristic.
>
> This patch copies the Documentation/feature-removal-schedule.txt entry
> for this tunable to the Documentation/ABI/obsolete directory so nobody
> misses it.
>
> Reported-by: Alan Cox <[email protected]>
> Signed-off-by: David Rientjes <[email protected]>

NAK. You seem to think that shouting a claim has some effect, but that's incorrect.
Your childish shouting doesn't solve any real-world issue. Only a code fix does.

> ---
> Documentation/ABI/obsolete/proc-pid-oom_adj | 22 ++++++++++++++++++++++
> 1 files changed, 22 insertions(+), 0 deletions(-)
> create mode 100644 Documentation/ABI/obsolete/proc-pid-oom_adj
>
> diff --git a/Documentation/ABI/obsolete/proc-pid-oom_adj b/Documentation/ABI/obsolete/proc-pid-oom_adj
> new file mode 100644
> --- /dev/null
> +++ b/Documentation/ABI/obsolete/proc-pid-oom_adj
> @@ -0,0 +1,22 @@
> +What: /proc/<pid>/oom_adj
> +When: August 2012
> +Why: /proc/<pid>/oom_adj allows userspace to influence the oom killer's
> + badness heuristic used to determine which task to kill when the kernel
> + is out of memory.
> +
> + The badness heuristic has since been rewritten since the introduction of
> + this tunable such that its meaning is deprecated. The value was
> + implemented as a bitshift on a score generated by the badness()
> + function that did not have any precise units of measure. With the
> + rewrite, the score is given as a proportion of available memory to the
> + task allocating pages, so using a bitshift which grows the score
> + exponentially is, thus, impossible to tune with fine granularity.
> +
> + A much more powerful interface, /proc/<pid>/oom_score_adj, was
> + introduced with the oom killer rewrite that allows users to increase or
> + decrease the badness() score linearly. This interface will replace
> + /proc/<pid>/oom_adj.

Incorrect. oom_adj and oom_score_adj are different concepts and different abstractions.
One can't replace the other.

> +
> + A warning will be emitted to the kernel log if an application uses this
> + deprecated interface. After it is printed once, future warnings will be
> + suppressed until the kernel is rebooted.

2010-11-15 00:24:25

by KOSAKI Motohiro

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

> > > Yes, I've tested it, and it deprecates the tunable as expected. A single
> > > warning message serves the purpose well: let users know one time without
> > > being overly verbose that the tunable is deprecated and give them
> > > sufficient time (2 years) to start using the new tunable. That's how
> > > deprecation is done.
> >
> > no sense.
> >
> > Why do their application need to rewrite for *YOU*? Okey, you will got
> > benefit from your new knob. But NOBDOY use the new one. and People need
> > to rewrite their application even though no benefit.
> >
> > Don't do selfish userland breakage!
>
> And you said you ignore bug even though you have seen it. It suck!

At v2.6.36-rc1, the oom-killer didn't work at all because YOU BROKE IT.
And I was working on fixing it.

2010-08-19
http://marc.info/?t=128223176900001&r=1&w=2
http://marc.info/?t=128221532700003&r=1&w=2
http://marc.info/?t=128221532500008&r=1&w=2

However, you submitted new crap before fixing it.

2010-08-15
http://marc.info/?t=128184669600001&r=1&w=2

If you had tested mainline a bit, you could have found the problem quickly.
You should have fixed the mainline kernel first.

Again, YOU HAVEN'T TESTED YOUR OWN PATCH AT ALL.

2010-11-15 09:59:23

by David Rientjes

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

On Mon, 15 Nov 2010, KOSAKI Motohiro wrote:

> At v2.6.36-rc1, oom-killer doesn't work at all because YOU BROKE.
> And I was working on fixing it.
>
> 2010-08-19
> http://marc.info/?t=128223176900001&r=1&w=2

This existed before my oom killer rewrite, it was only noticed because the
rewrite enabled oom_dump_tasks by default.

> http://marc.info/?t=128221532700003&r=1&w=2

Yes, tasklist_lock was dropped in a mismerge of my patches when posting
them. Thanks for finding it and posting a patch, I appreciate it.

> http://marc.info/?t=128221532500008&r=1&w=2
>

Yes, if a task was racing between oom_kill_process() and oom_kill_task()
and all threads had dropped its mm between calls then there was a NULL
pointer dereference, thanks for fixing that as well.

> However, You submitted new crap before the fixing.
>
> 2010-08-15
> http://marc.info/?t=128184669600001&r=1&w=2
>

This isn't "crap", this is a necessary bit to ensure that tasks that share
an ->mm with a task immune from kill aren't killed themselves since we
can't free the memory. We came to the consensus that it would be better
to count the tasks that are OOM_DISABLE in the mm_struct to avoid the
O(2*n) tasklist scan.

> If you tested mainline a bit, you could find the problem quickly.
> You should have fixed mainline kernel at first.
>

Thanks for finding a couple of fixes during 2.6.36-rc1, when the rewrite
was first merged, it's much appreciated!

2010-11-15 10:38:15

by David Rientjes

Subject: Re: [patch] oom: document obsolete oom_adj tunable

On Mon, 15 Nov 2010, KOSAKI Motohiro wrote:

> > /proc/pid/oom_adj was deprecated in August 2010 with the introduction of
> > the new oom killer heuristic.
> >
> > This patch copies the Documentation/feature-removal-schedule.txt entry
> > for this tunable to the Documentation/ABI/obsolete directory so nobody
> > misses it.
> >
> > Reported-by: Alan Cox <[email protected]>
> > Signed-off-by: David Rientjes <[email protected]>
>
> NAK. You seems to think shouting claim makes some effect. but It's incorrect.
> Your childish shout doesn't solve any real world issue. Only code fix does.
>

The tunable is deprecated. If you are really that concerned about the
existing users who you don't think can convert in the next two years, why
don't you help them convert? That fixes the issue, but you're not
interested in that. I offered to convert any open-source users you can
list (the hardest part of the conversion is finding who to send patches to
:). You're only interested in continuing to assert your position as
correct even when the kernel is obviously moving in a different direction.

Others may have a different opinion of who is being childish in this whole
ordeal.

2010-11-23 07:17:21

by KOSAKI Motohiro

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

> On Sun, 14 Nov 2010, KOSAKI Motohiro wrote:
>
> > No irrelevant. Your patch break their environment even though
> > they don't use oom_adj explicitly. because their application are using it.
> >
>
> The _only_ difference too oom_adj since the rewrite is that it is now
> mapped on a linear scale rather than an exponential scale.

_only_ means the difference is not ZERO. Why do userland applications need to be rewritten?

> That's because
> the heuristic itself has a defined range [0, 1000] that characterizes the
> memory usage of the application it is ranking. To show any breakge, you
> would have to show how oom_adj values being used by applications are based
> on a calculated value that prioritizes those tasks amongst each other.
> With the exponential scale, that's nearly impossible because of the number
> of arbitrary heuristics that were used before oom_adj were considered
> (runtime, nice level, CAP_SYS_RAWIO, etc).

But nobody has agreed that it is more powerful, even though you have
repeated the same explanation many times.

Again, IF you need a [0 .. 1000] range, you can calculate it in your
application. The current oom score can be read from /proc/pid/oom_score and
total memory can be read from /proc/meminfo. You shouldn't have broken
anything.

> So don't talk about userspace breakage when you can't even describe it or
> present a single usecase.

Huh? Remember! Your feature has ZERO users.
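
The userland calculation suggested above (deriving a [0 .. 1000] value from
total memory) might be sketched like this. It is only an illustration: the
helper names are hypothetical, the resident size of the process is an
assumed input that the caller would have to obtain separately, and only the
MemTotal field of /proc/meminfo is used.

```python
def mem_total_kb(meminfo_text: str) -> int:
    """Extract MemTotal (in kB) from the contents of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like "MemTotal:  16384000 kB".
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def proportional_score(rss_kb: int, total_kb: int) -> int:
    """Express a process's resident memory as a value on [0, 1000]."""
    return min(rss_kb * 1000 // total_kb, 1000)
```

On a Linux system the first helper would be fed open("/proc/meminfo").read().
Note that this sketch only looks at system-wide memory; it does not account
for cpusets, mempolicies or memcg limits.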

2010-11-23 07:17:13

by KOSAKI Motohiro

Subject: Re: [patch] oom: document obsolete oom_adj tunable

> On Mon, 15 Nov 2010, KOSAKI Motohiro wrote:
>
> > > /proc/pid/oom_adj was deprecated in August 2010 with the introduction of
> > > the new oom killer heuristic.
> > >
> > > This patch copies the Documentation/feature-removal-schedule.txt entry
> > > for this tunable to the Documentation/ABI/obsolete directory so nobody
> > > misses it.
> > >
> > > Reported-by: Alan Cox <[email protected]>
> > > Signed-off-by: David Rientjes <[email protected]>
> >
> > NAK. You seems to think shouting claim makes some effect. but It's incorrect.
> > Your childish shout doesn't solve any real world issue. Only code fix does.
> >
>
> The tunable is deprecated. If you are really that concerned about the
> existing users who you don't think can convert in the next two years, why
> don't you help them convert? That fixes the issue, but you're not
> interested in that. I offered to convert any open-source users you can
> list (the hardest part of the conversion is finding who to send patches to
> :). You're only interested in continuing to assert your position as
> correct even when the kernel is obviously moving in a different direction.

Why don't you change them with _your_ own hands?

_Usually_ userland software is changed first _by_ whoever wanted the change.
For example, we at Fujitsu changed the ELF core file format for when there
are >65536 vmas, but it didn't cause any breakage. We changed gdb, binutils,
elfutils, etc. _first_.

>
> Others may have a different opinion of who is being childish in this whole
> ordeal.

2010-11-28 01:42:10

by David Rientjes

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

On Tue, 23 Nov 2010, KOSAKI Motohiro wrote:

> > > No irrelevant. Your patch break their environment even though
> > > they don't use oom_adj explicitly. because their application are using it.
> > >
> >
> > The _only_ difference too oom_adj since the rewrite is that it is now
> > mapped on a linear scale rather than an exponential scale.
>
> _only_ mean don't ZERO different. Why do userland application need to rewrite?
>

Because NOTHING breaks with the new mapping. Eight months later since
this was initially proposed on linux-mm, you still cannot show a single
example that depended on the exponential mapping of oom_adj. I'm not
going to continue responding to your criticism about this point since your
argument is completely and utterly baseless.

> Again, IF you need to [0 .. 1000] range, you can calculate it by your
> application. current oom score can be get from /proc/pid/oom_score and
> total memory can be get from /proc/meminfo. You shouldn't have break
> anything.
>

That would require the userspace tunable to be adjusted anytime a task's
mempolicy changes, its nodemask changes, its cpuset attachment changes,
its mems change, a memcg limit changes, etc. The only constant is the
task's priority, and the current oom_score_adj implementation preserves
that unless explicitly changed later by the user. I completely understand
that you may not have a use for this.

2010-11-30 13:03:47

by KOSAKI Motohiro

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

> On Tue, 23 Nov 2010, KOSAKI Motohiro wrote:
>
> > > > No irrelevant. Your patch break their environment even though
> > > > they don't use oom_adj explicitly. because their application are using it.
> > > >
> > >
> > > The _only_ difference too oom_adj since the rewrite is that it is now
> > > mapped on a linear scale rather than an exponential scale.
> >
> > _only_ mean don't ZERO different. Why do userland application need to rewrite?
> >
>
> Because NOTHING breaks with the new mapping. Eight months later since
> this was initially proposed on linux-mm, you still cannot show a single
> example that depended on the exponential mapping of oom_adj. I'm not
> going to continue responding to your criticism about this point since your
> argument is completely and utterly baseless.

No regression means no breakage. Not single, not multiple. See?

>
> > Again, IF you need to [0 .. 1000] range, you can calculate it by your
> > application. current oom score can be get from /proc/pid/oom_score and
> > total memory can be get from /proc/meminfo. You shouldn't have break
> > anything.
> >
>
> That would require the userspace tunable to be adjusted anytime a task's
> mempolicy changes, its nodemask changes, it's cpuset attachment changes,

All of those situations can be calculated in userland. A user process can
know its bindings.

> its mems change, a memcg limit changes, etc. The only constant is the
> task's priority, and the current oom_score_adj implementation preserves
> that unless explicitly changed later by the user. I completely understand
> that you may not have a use for this.

2010-11-30 20:07:47

by David Rientjes

Subject: Re: [resend][PATCH 2/4] Revert "oom: deprecate oom_adj tunable"

On Tue, 30 Nov 2010, KOSAKI Motohiro wrote:

> > Because NOTHING breaks with the new mapping. Eight months later since
> > this was initially proposed on linux-mm, you still cannot show a single
> > example that depended on the exponential mapping of oom_adj. I'm not
> > going to continue responding to your criticism about this point since your
> > argument is completely and utterly baseless.
>
> No regression mean no break. Not single nor multiple. see?
>

Nothing breaks. If something did, you could respond to my answer above
and provide a single real-world example that broke as a result of the new
linear mapping.

> All situation can be calculated on userland. User process can be know
> their bindings.
>

Yes, but the proportional priority-based oom_score_adj values allow users
to avoid recalculating and writing that value anytime a mempolicy
attachment changes, its nodemask changes, it moves to another cpuset, its
set of mems changes, its memcg attachment changes, its limit is modified,
etc.
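
To make the point about proportional values concrete, here is a small
sketch of what a fixed oom_score_adj corresponds to in pages as the amount
of memory a task is allowed to use changes. The function name and the page
counts are made up for illustration, and the model assumes that one unit of
oom_score_adj is one thousandth of the allowed memory.

```python
def adj_in_pages(oom_score_adj: int, allowed_pages: int) -> int:
    """Pages that a given oom_score_adj represents for a given memory limit."""
    # One unit of oom_score_adj is assumed to be 1/1000 of allowed memory.
    return oom_score_adj * allowed_pages // 1000
```

With the same written value of -250, a task allowed 4,000,000 pages is
discounted by 1,000,000 pages, and after moving to a cpuset or memcg that
allows 1,000,000 pages it is discounted by 250,000 pages, without anything
being rewritten. A value computed in userland as an absolute amount would
have to be recalculated and written again after each such change.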