2009-04-09 00:45:33

by Nathan Lynch

Subject: [PATCH/RFC] do not count frozen tasks toward load

Freezing tasks via the cgroup freezer causes the load average to climb
because the freezer's current implementation puts frozen tasks in
uninterruptible sleep (D state).

Some applications which perform job-scheduling functions consult the
load average when making decisions. If a cgroup is frozen, the load
average does not provide a useful measure of the system's utilization
to such applications. This is especially inconvenient if the job
scheduler employs the cgroup freezer as a mechanism for preempting low
priority jobs. Contrast this with using SIGSTOP for the same purpose:
the stopped tasks do not count toward system load.

Change task_contributes_to_load() to return false if the task is
frozen. This results in /proc/loadavg behavior that better meets
users' expectations.

Signed-off-by: Nathan Lynch <[email protected]>
---
include/linux/sched.h | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 011db2f..f8af167 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -202,7 +202,8 @@ extern unsigned long long time_sync_thresh;
 #define task_is_stopped_or_traced(task)	\
 			((task->state & (__TASK_STOPPED | __TASK_TRACED)) != 0)
 #define task_contributes_to_load(task)	\
-				((task->state & TASK_UNINTERRUPTIBLE) != 0)
+				((task->state & TASK_UNINTERRUPTIBLE) != 0 && \
+				 (task->flags & PF_FROZEN) == 0)

 #define __set_task_state(tsk, state_value)		\
 	do { (tsk)->state = (state_value); } while (0)
--
1.6.0.6
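
[Editorial illustration, not part of the posted patch.] The effect of the one-line change can be modelled in userspace. This is a minimal sketch under stated assumptions: `struct task_model`, `contributes_before()` and `contributes_after()` are hypothetical names, and the two constants are illustrative stand-ins for the kernel's definitions in include/linux/sched.h.

```c
#include <stdbool.h>

/* Illustrative values only; the real definitions live in include/linux/sched.h. */
#define TASK_UNINTERRUPTIBLE	2
#define PF_FROZEN		0x00010000

/* Hypothetical stand-in for the two task_struct fields the macro reads. */
struct task_model {
	long state;
	unsigned int flags;
};

/* Predicate before the patch: every task in D state counts toward load. */
static inline bool contributes_before(const struct task_model *t)
{
	return (t->state & TASK_UNINTERRUPTIBLE) != 0;
}

/* Predicate after the patch: D-state tasks count unless they are frozen. */
static inline bool contributes_after(const struct task_model *t)
{
	return (t->state & TASK_UNINTERRUPTIBLE) != 0 &&
	       (t->flags & PF_FROZEN) == 0;
}
```

A frozen task (D state with PF_FROZEN set) satisfies the first predicate but not the second, while an ordinary task blocked in uninterruptible sleep satisfies both.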


2009-04-09 01:01:34

by Andrew Morton

Subject: Re: [PATCH/RFC] do not count frozen tasks toward load

On Wed, 8 Apr 2009 19:45:12 -0500 Nathan Lynch <[email protected]> wrote:

> Freezing tasks via the cgroup freezer causes the load average to climb
> because the freezer's current implementation puts frozen tasks in
> uninterruptible sleep (D state).
>
> Some applications which perform job-scheduling functions consult the
> load average when making decisions. If a cgroup is frozen, the load
> average does not provide a useful measure of the system's utilization
> to such applications. This is especially inconvenient if the job
> scheduler employs the cgroup freezer as a mechanism for preempting low
> priority jobs. Contrast this with using SIGSTOP for the same purpose:
> the stopped tasks do not count toward system load.
>
> Change task_contributes_to_load() to return false if the task is
> frozen. This results in /proc/loadavg behavior that better meets
> users' expectations.
>
> Signed-off-by: Nathan Lynch <[email protected]>
> ---
> include/linux/sched.h | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 011db2f..f8af167 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -202,7 +202,8 @@ extern unsigned long long time_sync_thresh;
>  #define task_is_stopped_or_traced(task)	\
>  			((task->state & (__TASK_STOPPED | __TASK_TRACED)) != 0)
>  #define task_contributes_to_load(task)	\
> -				((task->state & TASK_UNINTERRUPTIBLE) != 0)
> +				((task->state & TASK_UNINTERRUPTIBLE) != 0 && \
> +				 (task->flags & PF_FROZEN) == 0)
>
>  #define __set_task_state(tsk, state_value)		\
>  	do { (tsk)->state = (state_value); } while (0)

Looks OK to me. It should perhaps use !frozen(task), but the includes
are mucked up.

I suppose we should fix this in -stable too.

2009-04-09 01:03:27

by Nigel Cunningham

Subject: Re: [PATCH/RFC] do not count frozen tasks toward load

Hi.

On Wed, 2009-04-08 at 19:45 -0500, Nathan Lynch wrote:
> Freezing tasks via the cgroup freezer causes the load average to climb
> because the freezer's current implementation puts frozen tasks in
> uninterruptible sleep (D state).
>
> Some applications which perform job-scheduling functions consult the
> load average when making decisions. If a cgroup is frozen, the load
> average does not provide a useful measure of the system's utilization
> to such applications. This is especially inconvenient if the job
> scheduler employs the cgroup freezer as a mechanism for preempting low
> priority jobs. Contrast this with using SIGSTOP for the same purpose:
> the stopped tasks do not count toward system load.
>
> Change task_contributes_to_load() to return false if the task is
> frozen. This results in /proc/loadavg behavior that better meets
> users' expectations.

Sounds great to me - TuxOnIce has had code to save and restore the load
average for ages because of the same issue. This is much better because
it gets to the root of the problem.

I'll apply it here, give it a test and hopefully give you an Acked-by
shortly.

Regards,

Nigel

2009-04-09 01:22:15

by Nigel Cunningham

Subject: Re: [PATCH/RFC] do not count frozen tasks toward load

Hi again.

On Wed, 2009-04-08 at 19:45 -0500, Nathan Lynch wrote:
> Freezing tasks via the cgroup freezer causes the load average to climb
> because the freezer's current implementation puts frozen tasks in
> uninterruptible sleep (D state).
>
> Some applications which perform job-scheduling functions consult the
> load average when making decisions. If a cgroup is frozen, the load
> average does not provide a useful measure of the system's utilization
> to such applications. This is especially inconvenient if the job
> scheduler employs the cgroup freezer as a mechanism for preempting low
> priority jobs. Contrast this with using SIGSTOP for the same purpose:
> the stopped tasks do not count toward system load.
>
> Change task_contributes_to_load() to return false if the task is
> frozen. This results in /proc/loadavg behavior that better meets
> users' expectations.
>
> Signed-off-by: Nathan Lynch <[email protected]>
> ---
> include/linux/sched.h | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 011db2f..f8af167 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -202,7 +202,8 @@ extern unsigned long long time_sync_thresh;
>  #define task_is_stopped_or_traced(task)	\
>  			((task->state & (__TASK_STOPPED | __TASK_TRACED)) != 0)
>  #define task_contributes_to_load(task)	\
> -				((task->state & TASK_UNINTERRUPTIBLE) != 0)
> +				((task->state & TASK_UNINTERRUPTIBLE) != 0 && \
> +				 (task->flags & PF_FROZEN) == 0)
>
>  #define __set_task_state(tsk, state_value)		\
>  	do { (tsk)->state = (state_value); } while (0)

Tested-by: Nigel Cunningham <[email protected]>

Looks good to me (though I like Andrew's point about using task_frozen).

nigel@nigel-laptop:~$ cat /proc/loadavg
0.34 0.27 0.12 1/251 9001
nigel@nigel-laptop:~$ sudo hibernate
nigel@nigel-laptop:~$ cat /proc/loadavg
0.52 0.33 0.14 2/250 9807
nigel@nigel-laptop:~$
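
[Editorial illustration, not part of the original message.] Each /proc/loadavg line above carries five fields: the 1, 5 and 15 minute load averages, the runnable/total count of scheduling entities, and the most recently allocated PID. A tool that wants to compare readings before and after a freeze could parse a line roughly as follows. This is only a sketch: `struct loadavg_sample` and `parse_loadavg()` are hypothetical names, not an existing API.

```c
#include <stdio.h>

/* Hypothetical container for the five fields of one /proc/loadavg line. */
struct loadavg_sample {
	double avg1, avg5, avg15;	/* 1, 5 and 15 minute load averages */
	int runnable, total;		/* runnable / total scheduling entities */
	int last_pid;			/* most recently allocated PID */
};

/* Parse one line; returns 0 on success, -1 if the line is malformed. */
static inline int parse_loadavg(const char *line, struct loadavg_sample *out)
{
	int n = sscanf(line, "%lf %lf %lf %d/%d %d",
		       &out->avg1, &out->avg5, &out->avg15,
		       &out->runnable, &out->total, &out->last_pid);

	return n == 6 ? 0 : -1;
}
```

Feeding it the two lines from the transcript gives 1 minute averages of 0.34 before the hibernate cycle and 0.52 after it.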

2009-04-09 04:39:54

by Nathan Lynch

Subject: [tip:sched/urgent] sched: do not count frozen tasks toward load

Commit-ID: 34f2beeec5591259c43d195122de3cd26262d63b
Gitweb: http://git.kernel.org/tip/34f2beeec5591259c43d195122de3cd26262d63b
Author: Nathan Lynch <[email protected]>
AuthorDate: Wed, 8 Apr 2009 19:45:12 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 9 Apr 2009 06:09:49 +0200

sched: do not count frozen tasks toward load

Freezing tasks via the cgroup freezer causes the load average to climb
because the freezer's current implementation puts frozen tasks in
uninterruptible sleep (D state).

Some applications which perform job-scheduling functions consult the
load average when making decisions. If a cgroup is frozen, the load
average does not provide a useful measure of the system's utilization
to such applications. This is especially inconvenient if the job
scheduler employs the cgroup freezer as a mechanism for preempting low
priority jobs. Contrast this with using SIGSTOP for the same purpose:
the stopped tasks do not count toward system load.

Change task_contributes_to_load() to return false if the task is
frozen. This results in /proc/loadavg behavior that better meets
users' expectations.

Signed-off-by: Nathan Lynch <[email protected]>
Acked-by: Andrew Morton <[email protected]>
Acked-by: Nigel Cunningham <[email protected]>
Tested-by: Nigel Cunningham <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Matt Helsley <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
include/linux/sched.h | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 98e1fe5..b4c38bc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -205,7 +205,8 @@ extern unsigned long long time_sync_thresh;
 #define task_is_stopped_or_traced(task)	\
 			((task->state & (__TASK_STOPPED | __TASK_TRACED)) != 0)
 #define task_contributes_to_load(task)	\
-				((task->state & TASK_UNINTERRUPTIBLE) != 0)
+				((task->state & TASK_UNINTERRUPTIBLE) != 0 && \
+				 (task->flags & PF_FROZEN) == 0)

 #define __set_task_state(tsk, state_value)		\
 	do { (tsk)->state = (state_value); } while (0)

2009-04-09 05:43:54

by Nathan Lynch

Subject: [tip:sched/urgent] sched: do not count frozen tasks toward load

Commit-ID: e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd
Gitweb: http://git.kernel.org/tip/e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd
Author: Nathan Lynch <[email protected]>
AuthorDate: Wed, 8 Apr 2009 19:45:12 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 9 Apr 2009 07:37:02 +0200

sched: do not count frozen tasks toward load

Freezing tasks via the cgroup freezer causes the load average to climb
because the freezer's current implementation puts frozen tasks in
uninterruptible sleep (D state).

Some applications which perform job-scheduling functions consult the
load average when making decisions. If a cgroup is frozen, the load
average does not provide a useful measure of the system's utilization
to such applications. This is especially inconvenient if the job
scheduler employs the cgroup freezer as a mechanism for preempting low
priority jobs. Contrast this with using SIGSTOP for the same purpose:
the stopped tasks do not count toward system load.

Change task_contributes_to_load() to return false if the task is
frozen. This results in /proc/loadavg behavior that better meets
users' expectations.

Signed-off-by: Nathan Lynch <[email protected]>
Acked-by: Andrew Morton <[email protected]>
Acked-by: Nigel Cunningham <[email protected]>
Tested-by: Nigel Cunningham <[email protected]>
Cc: <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Matt Helsley <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
include/linux/sched.h | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 98e1fe5..b4c38bc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -205,7 +205,8 @@ extern unsigned long long time_sync_thresh;
 #define task_is_stopped_or_traced(task)	\
 			((task->state & (__TASK_STOPPED | __TASK_TRACED)) != 0)
 #define task_contributes_to_load(task)	\
-				((task->state & TASK_UNINTERRUPTIBLE) != 0)
+				((task->state & TASK_UNINTERRUPTIBLE) != 0 && \
+				 (task->flags & PF_FROZEN) == 0)

 #define __set_task_state(tsk, state_value)		\
 	do { (tsk)->state = (state_value); } while (0)

2009-04-10 09:18:16

by Pavel Machek

Subject: Re: [tip:sched/urgent] sched: do not count frozen tasks toward load

On Thu 2009-04-09 05:39:32, Nathan Lynch wrote:
> Commit-ID: e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd
> Gitweb: http://git.kernel.org/tip/e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd
> Author: Nathan Lynch <[email protected]>
> AuthorDate: Wed, 8 Apr 2009 19:45:12 -0500
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Thu, 9 Apr 2009 07:37:02 +0200
>
> sched: do not count frozen tasks toward load
>
> Freezing tasks via the cgroup freezer causes the load average to climb
> because the freezer's current implementation puts frozen tasks in
> uninterruptible sleep (D state).
>
> Some applications which perform job-scheduling functions consult the
> load average when making decisions. If a cgroup is frozen, the load
> average does not provide a useful measure of the system's utilization
> to such applications. This is especially inconvenient if the job
> scheduler employs the cgroup freezer as a mechanism for preempting low
> priority jobs. Contrast this with using SIGSTOP for the same purpose:
> the stopped tasks do not count toward system load.
>
> Change task_contributes_to_load() to return false if the task is
> frozen. This results in /proc/loadavg behavior that better meets
> users' expectations.
>
> Signed-off-by: Nathan Lynch <[email protected]>
> Acked-by: Andrew Morton <[email protected]>
> Acked-by: Nigel Cunningham <[email protected]>
> Tested-by: Nigel Cunningham <[email protected]>
> Cc: <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: Matt Helsley <[email protected]>
> LKML-Reference: <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>

Acked-by: Pavel Machek <[email protected]>

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html