2010-06-01 20:36:55

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads

On Mon, 31 May 2010, KOSAKI Motohiro wrote:

> From: Oleg Nesterov <[email protected]>
> Subject: oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads
>
> select_bad_process() thinks a kernel thread can't have ->mm != NULL, this
> is not true due to use_mm().
>
> Change the code to check PF_KTHREAD.
>
> Signed-off-by: Oleg Nesterov <[email protected]>
> Acked-by: David Rientjes <[email protected]>
> Signed-off-by: KOSAKI Motohiro <[email protected]>

This is already pushed in my oom killer rewrite as patch 14/18 "check
PF_KTHREAD instead of !mm to skip kthreads".

This does not need to be merged immediately since it's not vital: use_mm()
is only temporary state and these kthreads will once again be excluded
when they call unuse_mm(). The worst case scenario here is that the oom
killer will erroneously select one of these kthreads which cannot die and
will need to reselect another task on its next call.


2010-06-01 21:21:55

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads

On 06/01, David Rientjes wrote:
>
> On Mon, 31 May 2010, KOSAKI Motohiro wrote:
>
> > select_bad_process() thinks a kernel thread can't have ->mm != NULL, this
> > is not true due to use_mm().
> >
> > Change the code to check PF_KTHREAD.
>
> This is already pushed in my oom killer rewrite as patch 14/18 "check
> PF_KTHREAD instead of !mm to skip kthreads".
>
> This does not need to be merged immediately since it's not vital: use_mm()
> is only temporary state and these kthreads will once again be excluded
> when they call unuse_mm(). The worst case scenario here is that the oom
> killer will erroneously select one of these kthreads which cannot die

It can't die but force_sig() does bad things which shouldn't be done
with workqueue thread. Note that it removes SIG_IGN, sets
SIGNAL_GROUP_EXIT, makes signal_pending/fatal_signal_pedning true, etc.

But yes, I agree, the problem is minor. But nevertheless it is bug,
the longstanding bug with the simple fix. Why should we "hide" this fix
inside the long series of non-trivial patches which rewrite oom-killer?
And it is completely orthogonal to other changes.

Oleg.

2010-06-01 21:27:08

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads

On Tue, 1 Jun 2010, Oleg Nesterov wrote:

> But yes, I agree, the problem is minor. But nevertheless it is bug,
> the longstanding bug with the simple fix. Why should we "hide" this fix
> inside the long series of non-trivial patches which rewrite oom-killer?
> And it is completely orthogonal to other changes.
>

Again, the question is whether or not the fix is rc material or not,
otherwise there's no difference in the route that it gets upstream: the
patch is duplicated in both series. If you feel that this minor issue
(which has never been reported in at least the last three years and
doesn't have any side effects other than a couple of millisecond delay
until unuse_mm() when the oom killer will kill something else) should be
addressed in 2.6.35-rc2, then that's a conversation to be had with Andrew.

2010-06-02 13:54:45

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads

> On Tue, 1 Jun 2010, Oleg Nesterov wrote:
>
> > But yes, I agree, the problem is minor. But nevertheless it is bug,
> > the longstanding bug with the simple fix. Why should we "hide" this fix
> > inside the long series of non-trivial patches which rewrite oom-killer?
> > And it is completely orthogonal to other changes.
> >
>
> Again, the question is whether or not the fix is rc material or not,
> otherwise there's no difference in the route that it gets upstream: the
> patch is duplicated in both series. If you feel that this minor issue
> (which has never been reported in at least the last three years and
> doesn't have any side effects other than a couple of millisecond delay
> until unuse_mm() when the oom killer will kill something else) should be
> addressed in 2.6.35-rc2, then that's a conversation to be had with Andrew.

Well, we have bugfix-at-first development rule. Why do you refuse our
development process?


2010-06-02 21:09:29

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads

On Wed, 2 Jun 2010, KOSAKI Motohiro wrote:

> > Again, the question is whether or not the fix is rc material or not,
> > otherwise there's no difference in the route that it gets upstream: the
> > patch is duplicated in both series. If you feel that this minor issue
> > (which has never been reported in at least the last three years and
> > doesn't have any side effects other than a couple of millisecond delay
> > until unuse_mm() when the oom killer will kill something else) should be
> > addressed in 2.6.35-rc2, then that's a conversation to be had with Andrew.
>
> Well, we have bugfix-at-first development rule. Why do you refuse our
> development process?
>

This isn't a bugfix, it simply prevents a recall to the oom killer after
the kthread has called unuse_mm(). Please show where any side effects of
oom killing a kthread, which cannot exit, as a result of use_mm() causes a
problem _anywhere_.

If that's the definition you have for a "bugfix," then I could certainly
argue that some of my patches like "oom: filter tasks not sharing the same
cpuset" is a bugfix because it allows needlessly killing tasks that won't
free memory for current, or "oom: avoid oom killer for lowmem allocations"
is a bugfix because it allows killing a task that won't free lowmem, etc.

I agree that this is a nice patch to have to avoid that recall later,
which is why I merged it into my patchset, but let's please be accurate
about its impact.

2010-06-02 21:35:11

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads

On 06/02, David Rientjes wrote:
>
> On Wed, 2 Jun 2010, KOSAKI Motohiro wrote:
>
> > > Again, the question is whether or not the fix is rc material or not,
> > > otherwise there's no difference in the route that it gets upstream: the
> > > patch is duplicated in both series. If you feel that this minor issue
> > > (which has never been reported in at least the last three years and
> > > doesn't have any side effects other than a couple of millisecond delay
> > > until unuse_mm() when the oom killer will kill something else) should be
> > > addressed in 2.6.35-rc2, then that's a conversation to be had with Andrew.
> >
> > Well, we have bugfix-at-first development rule. Why do you refuse our
> > development process?
>
> This isn't a bugfix, it simply prevents a recall to the oom killer after
> the kthread has called unuse_mm(). Please show where any side effects of
> oom killing a kthread, which cannot exit, as a result of use_mm() causes a
> problem _anywhere_.

I already showed you the side effects, but you removed this part in your
reply.

>From http://marc.info/?l=linux-kernel&m=127542732121077

It can't die but force_sig() does bad things which shouldn't be done
with workqueue thread. Note that it removes SIG_IGN, sets
SIGNAL_GROUP_EXIT, makes signal_pending/fatal_signal_pedning true, etc.

A workqueue thread must not run with SIGNAL_GROUP_EXIT set, SIGKILL
must be ignored, signal_pending() must not be true.

This is bug. It is minor, agreed, currently use_mm() is only used by aio.

Oleg.

2010-06-02 21:46:41

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads

On Wed, 2 Jun 2010, Oleg Nesterov wrote:

> > This isn't a bugfix, it simply prevents a recall to the oom killer after
> > the kthread has called unuse_mm(). Please show where any side effects of
> > oom killing a kthread, which cannot exit, as a result of use_mm() causes a
> > problem _anywhere_.
>
> I already showed you the side effects, but you removed this part in your
> reply.
>
> From http://marc.info/?l=linux-kernel&m=127542732121077
>
> It can't die but force_sig() does bad things which shouldn't be done
> with workqueue thread. Note that it removes SIG_IGN, sets
> SIGNAL_GROUP_EXIT, makes signal_pending/fatal_signal_pedning true, etc.
>
> A workqueue thread must not run with SIGNAL_GROUP_EXIT set, SIGKILL
> must be ignored, signal_pending() must not be true.
>
> This is bug. It is minor, agreed, currently use_mm() is only used by aio.
>

It's a problem that would probably never happen in practice because you're
talking about a race between select_bad_process() and __oom_kill_task()
which is wide since it iterates the entire tasklist, which workqueue
threads will be near the beginning of, and there is an extremely small
chance that the badness score for the mm that it assumed would be
considered the ideal task to kill. If you think this is rc material, then
push it to Andrew and say that.

2010-06-03 14:28:56

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads

On 06/02, David Rientjes wrote:
>
> On Wed, 2 Jun 2010, Oleg Nesterov wrote:
>
> > > This isn't a bugfix, it simply prevents a recall to the oom killer after
> > > the kthread has called unuse_mm(). Please show where any side effects of
> > > oom killing a kthread, which cannot exit, as a result of use_mm() causes a
> > > problem _anywhere_.
> >
> > I already showed you the side effects, but you removed this part in your
> > reply.
> >
> > From http://marc.info/?l=linux-kernel&m=127542732121077
> >
> > It can't die but force_sig() does bad things which shouldn't be done
> > with workqueue thread. Note that it removes SIG_IGN, sets
> > SIGNAL_GROUP_EXIT, makes signal_pending/fatal_signal_pedning true, etc.
> >
> > A workqueue thread must not run with SIGNAL_GROUP_EXIT set, SIGKILL
> > must be ignored, signal_pending() must not be true.
> >
> > This is bug. It is minor, agreed, currently use_mm() is only used by aio.
>
> It's a problem that would probably never happen in practice because

No need to convince me this bug is minor. I repeated this every time.
I only argued with the "isn't a bugfix, no side effects".

> considered the ideal task to kill. If you think this is rc material, then
> push it to Andrew and say that.

No, I don't think it is strictly necessary to push this fix into rc.
But I don't understand why this matters. And in any case, when it comes
to oom, I am in no position to make any authoritative decisions.


David, I don't understand why do you refuse to re-diff your changes
on top of Kosaki's work. If nothing else, this will help to review
your changes.

Oleg.

2010-06-03 20:11:34

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads

On Thu, 3 Jun 2010, Oleg Nesterov wrote:

> David, I don't understand why do you refuse to re-diff your changes
> on top of Kosaki's work. If nothing else, this will help to review
> your changes.
>

I simply don't have enough time in a day to rebase my rewrite patches on
top of what Andrew may or may not merge into -mm. When he merges
something, that would be different.

I don't think we need to push anything to -mm right now that isn't rc
material since the rewrite should make it in before the merge window. If
there are outstanding fixes that should go into rc (and probably stable
material as well), those need to be pushed to Andrew immediately. I
disagree that I've seen any to date that are immediate fixes.