2009-03-19 19:23:35

by Peter Lojkin

[permalink] [raw]
Subject: 2.6.28, limiting cpu time doesn't work

Hello,

after upgrade to 2.6.28 ulimit -t doesn't work. for example:

bash# ulimit -t 3; cpuhog

(where cpuhog is any program that continuously use cpu)
with 2.6.27.20 cpuhog gets killed after 3sec as expected.
with 2.6.28, 2.6.28.8, 2.6.29-rc8-git4 it's keep running indefinitely.
ulimit -a and /proc/<pid>/limits show that cputime limit was set correctly.

have i missed some global api changes? recompilation of recent glibc with new kernel headers and shell doesn't help...

ps please cc me, i'm not on the list


2009-03-22 20:18:51

by Oleg Nesterov

[permalink] [raw]
Subject: Re: 2.6.28, limiting cpu time doesn't work

On 03/19, Peter Lojkin wrote:
>
> after upgrade to 2.6.28 ulimit -t doesn't work. for example:
>
> bash# ulimit -t 3; cpuhog
>
> (where cpuhog is any program that continuously use cpu)
> with 2.6.27.20 cpuhog gets killed after 3sec as expected.
> with 2.6.28, 2.6.28.8, 2.6.29-rc8-git4 it's keep running indefinitely.
> ulimit -a and /proc/<pid>/limits show that cputime limit was set correctly.

Found this message on http://bugzilla.kernel.org/show_bug.cgi?id=12911 ...

I _think_ posix_cpu_timers_init_group() is not right, it should copy
cputime_expires->prof_exp.

Peter, any chance you can test the (uncompiled/untested) patch below?

Also, I assume that something like

$ ulimit -t 3
$ while true; do true; done

kills the shell correctly, yes? IOW, I suspect that ulimit works, but
cpuhog never check RLIMIT_CPU because fastpath_timer_check() always
returns 0 due to task_cputime_zero(&sig->cputime_expires) == T.

I'm afraid we need the fix fo 2.6.29 as well, but I am looking at rc3.

Hmm. check_process_timers() updates ->cputime_expires, but it never
clears (say) cputime_expires.prof_exp, why? Can't we just do

if (cputime_gt(sig->cputime_expires.prof_exp, prof_expires))
sig->cputime_expires.prof_exp = prof_expires;
at the end?

Oleg.

--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -790,9 +790,7 @@ static void posix_cpu_timers_init_group(
sig->it_prof_incr = cputime_zero;

/* Cached expiration times. */
- sig->cputime_expires.prof_exp = cputime_zero;
- sig->cputime_expires.virt_exp = cputime_zero;
- sig->cputime_expires.sched_exp = 0;
+ sig->cputime_expires = current->signal->cputime_expires;

/* The timer lists. */
INIT_LIST_HEAD(&sig->cpu_timers[0]);

2009-03-22 23:15:22

by Oleg Nesterov

[permalink] [raw]
Subject: Re: 2.6.28, limiting cpu time doesn't work

On 03/23, Peter Lojkin wrote:
>
> Oleg Nesterov wrote:
>
> > Found this message on http://bugzilla.kernel.org/show_bug.cgi?id=12911 ...
> >
> > I _think_ posix_cpu_timers_init_group() is not right, it should copy
> > cputime_expires->prof_exp.
> >
> > Peter, any chance you can test the (uncompiled/untested) patch below?
>
> yes, with this patch 2.6.28.8 works as expected, thank you!

Great, thanks!

> if you need to test any more patches on the subject i'm ready to do it.
> regression test system for our project depends on ability to limit cpu time,
> so it's major problem for us...

I am not sure what should we do, this needs more discussion.

Probably the most simple patch for -stable and 2.6.29 is below.
(with this patch we don't even need update_rlimit_cpu(), afaics).

Oleg.

--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -1263,7 +1263,8 @@ static inline int fastpath_timer_check(s
if (task_cputime_expired(&group_sample, &sig->cputime_expires))
return 1;
}
- return 0;
+
+ return sig->rlim[RLIMIT_CPU].rlim_cur != RLIM_INFINITY;
}

/*

2009-03-23 16:44:24

by Ingo Molnar

[permalink] [raw]
Subject: Re: 2.6.28, limiting cpu time doesn't work


* Oleg Nesterov <[email protected]> wrote:

> On 03/23, Peter Lojkin wrote:
> >
> > Oleg Nesterov wrote:
> >
> > > Found this message on http://bugzilla.kernel.org/show_bug.cgi?id=12911 ...
> > >
> > > I _think_ posix_cpu_timers_init_group() is not right, it should copy
> > > cputime_expires->prof_exp.
> > >
> > > Peter, any chance you can test the (uncompiled/untested) patch below?
> >
> > yes, with this patch 2.6.28.8 works as expected, thank you!
>
> Great, thanks!
>
> > if you need to test any more patches on the subject i'm ready to do it.
> > regression test system for our project depends on ability to limit cpu time,
> > so it's major problem for us...
>
> I am not sure what should we do, this needs more discussion.
>
> Probably the most simple patch for -stable and 2.6.29 is below.
> (with this patch we don't even need update_rlimit_cpu(), afaics).
>
> Oleg.
>
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -1263,7 +1263,8 @@ static inline int fastpath_timer_check(s
> if (task_cputime_expired(&group_sample, &sig->cputime_expires))
> return 1;
> }
> - return 0;
> +
> + return sig->rlim[RLIMIT_CPU].rlim_cur != RLIM_INFINITY;

Once this version of the patch is confirmed to fix the bug, please
send a full patch with a full changelog, SOB, Reported-by and
Tested-by lines.

Thanks,

Ingo

2009-03-23 19:39:20

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH, for 2.6.29] BUG 12911: fix RLIMIT_CPU && fork()

See http://bugzilla.kernel.org/show_bug.cgi?id=12911

copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
fastpath_timer_check() returns false unless we have other cpu timers.

This is the minimal fix for 2.6.29 (tested) and 2.6.28. The patch is not
optimal, we need further cleanups here. With this patch update_rlimit_cpu()
is not really needed, but I don't think it should be removed.

The proper fix (I think) is:

- set_process_cpu_timer() should just start the cputimer->running
logic (it does), no need to change cputime_expires.xxx_exp

- posix_cpu_timers_init_group() should set ->running when needed

- fastpath_timer_check() can check ->running instead of
task_cputime_zero(signal->cputime_expires)

Reported-by: Peter Lojkin <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>

--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -1370,7 +1370,8 @@ static inline int fastpath_timer_check(s
if (task_cputime_expired(&group_sample, &sig->cputime_expires))
return 1;
}
- return 0;
+
+ return sig->rlim[RLIMIT_CPU].rlim_cur != RLIM_INFINITY;
}

/*

2009-03-23 19:46:51

by Oleg Nesterov

[permalink] [raw]
Subject: [tip:timers/urgent] posix timers: fix RLIMIT_CPU && fork()

Commit-ID: 37bebc70d7ad4144c571d74500db3bb26ec0c0eb
Gitweb: http://git.kernel.org/tip/37bebc70d7ad4144c571d74500db3bb26ec0c0eb
Author: Oleg Nesterov <[email protected]>
AuthorDate: Mon, 23 Mar 2009 20:34:11 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 23 Mar 2009 20:43:35 +0100

posix timers: fix RLIMIT_CPU && fork()

See http://bugzilla.kernel.org/show_bug.cgi?id=12911

copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
fastpath_timer_check() returns false unless we have other cpu timers.

This is the minimal fix for 2.6.29 (tested) and 2.6.28. The patch is not
optimal, we need further cleanups here. With this patch update_rlimit_cpu()
is not really needed, but I don't think it should be removed.

The proper fix (I think) is:

- set_process_cpu_timer() should just start the cputimer->running
logic (it does), no need to change cputime_expires.xxx_exp

- posix_cpu_timers_init_group() should set ->running when needed

- fastpath_timer_check() can check ->running instead of
task_cputime_zero(signal->cputime_expires)

Reported-by: Peter Lojkin <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Roland McGrath <[email protected]>
Cc: <[email protected]> [for 2.6.29.x]
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/posix-cpu-timers.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index e976e50..8e5d9a6 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -1370,7 +1370,8 @@ static inline int fastpath_timer_check(struct task_struct *tsk)
if (task_cputime_expired(&group_sample, &sig->cputime_expires))
return 1;
}
- return 0;
+
+ return sig->rlim[RLIMIT_CPU].rlim_cur != RLIM_INFINITY;
}

/*

2009-03-24 18:31:19

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [tip:timers/urgent] posix timers: fix RLIMIT_CPU && fork()

On 03/23, Oleg Nesterov wrote:
>
> Commit-ID: 37bebc70d7ad4144c571d74500db3bb26ec0c0eb
> Gitweb: http://git.kernel.org/tip/37bebc70d7ad4144c571d74500db3bb26ec0c0eb
> Author: Oleg Nesterov <[email protected]>
> AuthorDate: Mon, 23 Mar 2009 20:34:11 +0100
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Mon, 23 Mar 2009 20:43:35 +0100
>
> posix timers: fix RLIMIT_CPU && fork()
>
> See http://bugzilla.kernel.org/show_bug.cgi?id=12911
>
> copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
> posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
> fastpath_timer_check() returns false unless we have other cpu timers.
>
> This is the minimal fix for 2.6.29 (tested) and 2.6.28. The patch is not
> optimal,

Ingo, please drop this patch, it is very suboptimal.

My intent was to make the obviously correct patch for 2.6.29, but since
it was already released I'll send another one.

And,

> we need further cleanups here. With this patch update_rlimit_cpu()
> is not really needed, but I don't think it should be removed.
>
> The proper fix (I think) is:
>
> - set_process_cpu_timer() should just start the cputimer->running
> logic (it does), no need to change cputime_expires.xxx_exp

I am stupid, of course we should set cputime_expires.xxx_exp to avoid
the slow path in run_posix_cpu_timers().

Oleg.

2009-03-24 21:06:20

by Ingo Molnar

[permalink] [raw]
Subject: Re: [tip:timers/urgent] posix timers: fix RLIMIT_CPU && fork()


* Oleg Nesterov <[email protected]> wrote:

> On 03/23, Oleg Nesterov wrote:
> >
> > Commit-ID: 37bebc70d7ad4144c571d74500db3bb26ec0c0eb
> > Gitweb: http://git.kernel.org/tip/37bebc70d7ad4144c571d74500db3bb26ec0c0eb
> > Author: Oleg Nesterov <[email protected]>
> > AuthorDate: Mon, 23 Mar 2009 20:34:11 +0100
> > Committer: Ingo Molnar <[email protected]>
> > CommitDate: Mon, 23 Mar 2009 20:43:35 +0100
> >
> > posix timers: fix RLIMIT_CPU && fork()
> >
> > See http://bugzilla.kernel.org/show_bug.cgi?id=12911
> >
> > copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
> > posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
> > fastpath_timer_check() returns false unless we have other cpu timers.
> >
> > This is the minimal fix for 2.6.29 (tested) and 2.6.28. The patch is not
> > optimal,
>
> Ingo, please drop this patch, it is very suboptimal.

suboptimal why?

> My intent was to make the obviously correct patch for 2.6.29, but
> since it was already released I'll send another one.

Please send it - thanks.

Ingo

2009-03-24 21:38:46

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [tip:timers/urgent] posix timers: fix RLIMIT_CPU && fork()

On 03/24, Ingo Molnar wrote:
>
> * Oleg Nesterov <[email protected]> wrote:
>
> > On 03/23, Oleg Nesterov wrote:
> > >
> > > Commit-ID: 37bebc70d7ad4144c571d74500db3bb26ec0c0eb
> > > Gitweb: http://git.kernel.org/tip/37bebc70d7ad4144c571d74500db3bb26ec0c0eb
> > > Author: Oleg Nesterov <[email protected]>
> > > AuthorDate: Mon, 23 Mar 2009 20:34:11 +0100
> > > Committer: Ingo Molnar <[email protected]>
> > > CommitDate: Mon, 23 Mar 2009 20:43:35 +0100
> > >
> > > posix timers: fix RLIMIT_CPU && fork()
> > >
> > > See http://bugzilla.kernel.org/show_bug.cgi?id=12911
> > >
> > > copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
> > > posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
> > > fastpath_timer_check() returns false unless we have other cpu timers.
> > >
> > > This is the minimal fix for 2.6.29 (tested) and 2.6.28. The patch is not
> > > optimal,
> >
> > Ingo, please drop this patch, it is very suboptimal.
>
> suboptimal why?

Because this patch provokes the slow path on every tick if this process
has rlim[RLIMIT_CPU].rlim_cur != RLIM_INFINITY, even if RLIMIT_CPU is
not expired yet.

So I think the initial patch I sent (which modifies copy_signal) is better,
but first I'd like to re-check the code once again.

Oleg.

2009-03-22 22:13:13

by Peter Lojkin

[permalink] [raw]
Subject: Re: 2.6.28, limiting cpu time doesn't work

Oleg Nesterov wrote:

> Found this message on http://bugzilla.kernel.org/show_bug.cgi?id=12911 ...
>
> I _think_ posix_cpu_timers_init_group() is not right, it should copy
> cputime_expires->prof_exp.
>
> Peter, any chance you can test the (uncompiled/untested) patch below?

yes, with this patch 2.6.28.8 works as expected, thank you!

> Also, I assume that something like
>
> $ ulimit -t 3
> $ while true; do true; done
>
> kills the shell correctly, yes?

yes, i've found it after original message and posted it in follow up:
http://marc.info/?l=linux-kernel&m=123770019023095&w=4

> IOW, I suspect that ulimit works, but
> cpuhog never check RLIMIT_CPU because fastpath_timer_check() always
> returns 0 due to task_cputime_zero(&sig->cputime_expires) == T.
>
> I'm afraid we need the fix fo 2.6.29 as well, but I am looking at rc3.
>
> Hmm. check_process_timers() updates ->cputime_expires, but it never
> clears (say) cputime_expires.prof_exp, why? Can't we just do
>
> if (cputime_gt(sig->cputime_expires.prof_exp, prof_expires))
> sig->cputime_expires.prof_exp = prof_expires;
> at the end?

if you need to test any more patches on the subject i'm ready to do it.
regression test system for our project depends on ability to limit cpu time,
so it's major problem for us...

> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -790,9 +790,7 @@ static void posix_cpu_timers_init_group(
> sig->it_prof_incr = cputime_zero;
>
> /* Cached expiration times. */
> - sig->cputime_expires.prof_exp = cputime_zero;
> - sig->cputime_expires.virt_exp = cputime_zero;
> - sig->cputime_expires.sched_exp = 0;
> + sig->cputime_expires = current->signal->cputime_expires;
>
> /* The timer lists. */
> INIT_LIST_HEAD(&sig->cpu_timers[0]);
>

2009-03-24 02:44:11

by Peter Lojkin

[permalink] [raw]
Subject: Re: 2.6.28, limiting cpu time doesn't work

Oleg Nesterov wrote:

> I am not sure what should we do, this needs more discussion.
>
> Probably the most simple patch for -stable and 2.6.29 is below.
> (with this patch we don't even need update_rlimit_cpu(), afaics).

tested on both 2.6.28.9 and 2.6.29 - RLIMIT_CPU works as expected

> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -1263,7 +1263,8 @@ static inline int fastpath_timer_check(s
> if (task_cputime_expired(&group_sample, &sig->cputime_expires))
> return 1;
> }
> - return 0;
> +
> + return sig->rlim[RLIMIT_CPU].rlim_cur != RLIM_INFINITY;
> }
>
> /*