2011-04-11 21:59:18

by Andrew Morton

Subject: Re: [PATCH 2/4] remove boost_dying_task_prio()

On Mon, 11 Apr 2011 14:31:18 +0900 (JST)
KOSAKI Motohiro <[email protected]> wrote:

> This is almost a revert of commit 93b43fa (oom: give the dying
> task a higher priority).
>
> That commit dramatically improved the oom killer logic when a fork bomb
> occurs. But I've found it has a nasty corner case. The cpu cgroup
> currently has a strange default RT runtime: it's 0! That means that if a
> process under a cpu cgroup is promoted to the RT scheduling class, the
> process never runs at all.

hm. How did that happen? I thought that sched_setscheduler() modifies
only a single thread, and that thread is in the process of exiting?

> Eventually, the kernel may hang when an oom kill occurs.
> Luis, the original author, and I agreed to disable this logic at
> once.
>
> ...
>
> index 6a819d1..83fb72c1 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -84,24 +84,6 @@ static bool has_intersects_mems_allowed(struct task_struct *tsk,
> #endif /* CONFIG_NUMA */
>
> /*
> - * If this is a system OOM (not a memcg OOM) and the task selected to be
> - * killed is not already running at high (RT) priorities, speed up the
> - * recovery by boosting the dying task to the lowest FIFO priority.
> - * That helps with the recovery and avoids interfering with RT tasks.
> - */
> -static void boost_dying_task_prio(struct task_struct *p,
> - struct mem_cgroup *mem)
> -{
> - struct sched_param param = { .sched_priority = 1 };
> -
> - if (mem)
> - return;
> -
> - if (!rt_task(p))
> - sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
> -}

I'm rather glad to see that code go away though - SCHED_FIFO is
dangerous...


2011-04-12 00:35:22

by KOSAKI Motohiro

Subject: Re: [PATCH 2/4] remove boost_dying_task_prio()

Hi

> On Mon, 11 Apr 2011 14:31:18 +0900 (JST)
> KOSAKI Motohiro <[email protected]> wrote:
>
> > This is almost a revert of commit 93b43fa (oom: give the dying
> > task a higher priority).
> >
> > That commit dramatically improved the oom killer logic when a fork bomb
> > occurs. But I've found it has a nasty corner case. The cpu cgroup
> > currently has a strange default RT runtime: it's 0! That means that if a
> > process under a cpu cgroup is promoted to the RT scheduling class, the
> > process never runs at all.
>
> hm. How did that happen? I thought that sched_setscheduler() modifies
> only a single thread, and that thread is in the process of exiting?

If an admin puts a !RT process into a cpu cgroup with rtruntime=0,
it usually runs perfectly well, because a !RT task isn't affected by the
rtruntime knob. But if the task is promoted to RT, either by an explicit
setscheduler() syscall or by the OOM killer, it can't run at all.

In short, the oom killer currently doesn't work at all if the admin is
using a cpu cgroup and hasn't touched the rtruntime knob.
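
To make the failure mode above concrete, here is a minimal userspace
sketch in C. It is not kernel code: the struct, the enum and both helper
functions are hypothetical names invented for illustration, and the model
assumes a single task that wants the CPU continuously, ignoring
competition from other tasks. It only shows that a group with an RT
runtime of 0 gives an RT task no CPU time per period, while a !RT task in
the same group is unaffected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one group's RT bandwidth settings (not a kernel type). */
struct rt_bandwidth_model {
	int64_t rt_period_us;	/* length of one accounting period */
	int64_t rt_runtime_us;	/* RT CPU time allowed per period; negative = no limit */
};

/* Hypothetical stand-in for the task's scheduling class. */
enum sched_class_model { CLASS_NORMAL, CLASS_RT };

/*
 * How many microseconds per period a task that always wants the CPU
 * is allowed to run under this simplified model.
 */
int64_t runnable_us_per_period(const struct rt_bandwidth_model *bw,
			       enum sched_class_model cls)
{
	/* A !RT task is not limited by the RT runtime budget. */
	if (cls == CLASS_NORMAL)
		return bw->rt_period_us;

	/* A negative runtime is modelled as "unlimited". */
	if (bw->rt_runtime_us < 0)
		return bw->rt_period_us;

	/* An RT task can never get more than the whole period. */
	if (bw->rt_runtime_us > bw->rt_period_us)
		return bw->rt_period_us;

	return bw->rt_runtime_us;
}

/* True if the task gets any CPU time at all in a period. */
bool can_make_progress(const struct rt_bandwidth_model *bw,
		       enum sched_class_model cls)
{
	return runnable_us_per_period(bw, cls) > 0;
}
```

With a group modelled as { .rt_period_us = 1000000, .rt_runtime_us = 0 },
can_make_progress() is true for CLASS_NORMAL and false for CLASS_RT. In
this model, that is the situation the OOM victim ends up in after being
boosted to SCHED_FIFO: it needs to run in order to exit and free memory,
but it has no RT budget to run with.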