2009-11-27 02:31:46

by Anton Blanchard

[permalink] [raw]
Subject: PATCH: softlockup: Fix hung_task_check_count sysctl


I'm seeing spikes of up to 0.5ms in khungtaskd on a large machine. To reduce
this source of jitter I tried setting hung_task_check_count to 0:

# echo 0 > /proc/sys/kernel/hung_task_check_count

which didn't have the intended response. Change to a post increment of
max_count, so a value of 0 means check 0 tasks.

Signed-off-by: Anton Blanchard <[email protected]>
---

Index: linux.trees.git/kernel/hung_task.c
===================================================================
--- linux.trees.git.orig/kernel/hung_task.c 2009-11-27 13:11:46.000000000 +1100
+++ linux.trees.git/kernel/hung_task.c 2009-11-27 13:11:57.000000000 +1100
@@ -144,7 +144,7 @@ static void check_hung_uninterruptible_t

rcu_read_lock();
do_each_thread(g, t) {
- if (!--max_count)
+ if (!max_count--)
goto unlock;
if (!--batch_count) {
batch_count = HUNG_TASK_BATCHING;


2009-11-27 02:40:56

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: PATCH: softlockup: Fix hung_task_check_count sysctl

On Fri, Nov 27, 2009 at 01:28:20PM +1100, Anton Blanchard wrote:
>
> I'm seeing spikes of up to 0.5ms in khungtaskd on a large machine. To reduce
> this source of jitter I tried setting hung_task_check_count to 0:
>
> # echo 0 > /proc/sys/kernel/hung_task_check_count
>
> which didn't have the intended response. Change to a post increment of
> max_count, so a value of 0 means check 0 tasks.
>
> Signed-off-by: Anton Blanchard <[email protected]>


Acked-by: Frederic Weisbecker <[email protected]>


> ---
>
> Index: linux.trees.git/kernel/hung_task.c
> ===================================================================
> --- linux.trees.git.orig/kernel/hung_task.c 2009-11-27 13:11:46.000000000 +1100
> +++ linux.trees.git/kernel/hung_task.c 2009-11-27 13:11:57.000000000 +1100
> @@ -144,7 +144,7 @@ static void check_hung_uninterruptible_t
>
> rcu_read_lock();
> do_each_thread(g, t) {
> - if (!--max_count)
> + if (!max_count--)
> goto unlock;
> if (!--batch_count) {
> batch_count = HUNG_TASK_BATCHING;

2009-11-27 02:46:48

by Cong Wang

[permalink] [raw]
Subject: Re: PATCH: softlockup: Fix hung_task_check_count sysctl

On Fri, Nov 27, 2009 at 10:28 AM, Anton Blanchard <[email protected]> wrote:
>
> I'm seeing spikes of up to 0.5ms in khungtaskd on a large machine. To reduce
> this source of jitter I tried setting hung_task_check_count to 0:
>
> # echo 0 > /proc/sys/kernel/hung_task_check_count
>
> which didn't have the intended response. Change to a post increment of
> max_count, so a value of 0 means check 0 tasks.
>
> Signed-off-by: Anton Blanchard <[email protected]>


Ack.

I would also suggest to make 'max_count' as unsigned long,
since sysctl_hung_task_check_count is.

Thanks.

> ---
>
> Index: linux.trees.git/kernel/hung_task.c
> ===================================================================
> --- linux.trees.git.orig/kernel/hung_task.c     2009-11-27 13:11:46.000000000 +1100
> +++ linux.trees.git/kernel/hung_task.c  2009-11-27 13:11:57.000000000 +1100
> @@ -144,7 +144,7 @@ static void check_hung_uninterruptible_t
>
>        rcu_read_lock();
>        do_each_thread(g, t) {
> -               if (!--max_count)
> +               if (!max_count--)
>                        goto unlock;
>                if (!--batch_count) {
>                        batch_count = HUNG_TASK_BATCHING;

2009-11-27 02:55:09

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: PATCH: softlockup: Fix hung_task_check_count sysctl

On Fri, Nov 27, 2009 at 10:46:53AM +0800, Am?rico Wang wrote:
> On Fri, Nov 27, 2009 at 10:28 AM, Anton Blanchard <[email protected]> wrote:
> >
> > I'm seeing spikes of up to 0.5ms in khungtaskd on a large machine. To reduce
> > this source of jitter I tried setting hung_task_check_count to 0:
> >
> > # echo 0 > /proc/sys/kernel/hung_task_check_count
> >
> > which didn't have the intended response. Change to a post increment of
> > max_count, so a value of 0 means check 0 tasks.
> >
> > Signed-off-by: Anton Blanchard <[email protected]>
>
>
> Ack.
>
> I would also suggest to make 'max_count' as unsigned long,
> since sysctl_hung_task_check_count is.
>
> Thanks.


Also, the batch_count thing should be dropped I think.
This is a hardcoded, not overridable pause after 1024
threads checks to avoid latencies caused by rcu_read_lock.
But now we have PREEMPT_RCU so people can enable it if
they care about latency. We should remove it as it adds
unnecessary complexity.

I'm preparing a patch for that, on top of Anton patch.

2009-11-27 03:02:09

by Cong Wang

[permalink] [raw]
Subject: Re: PATCH: softlockup: Fix hung_task_check_count sysctl

On Fri, Nov 27, 2009 at 10:55 AM, Frederic Weisbecker
<[email protected]> wrote:
> On Fri, Nov 27, 2009 at 10:46:53AM +0800, Américo Wang wrote:
>> On Fri, Nov 27, 2009 at 10:28 AM, Anton Blanchard <[email protected]> wrote:
>> >
>> > I'm seeing spikes of up to 0.5ms in khungtaskd on a large machine. To reduce
>> > this source of jitter I tried setting hung_task_check_count to 0:
>> >
>> > # echo 0 > /proc/sys/kernel/hung_task_check_count
>> >
>> > which didn't have the intended response. Change to a post increment of
>> > max_count, so a value of 0 means check 0 tasks.
>> >
>> > Signed-off-by: Anton Blanchard <[email protected]>
>>
>>
>> Ack.
>>
>> I would also suggest to make 'max_count' as unsigned long,
>> since sysctl_hung_task_check_count is.
>>
>> Thanks.
>
>
> Also, the batch_count thing should be dropped I think.
> This is a hardcoded, not overridable pause after 1024
> threads checks to avoid latencies caused by rcu_read_lock.
> But now we have PREEMPT_RCU so people can enable it if
> they care about latency. We should remove it as it adds
> unnecessary complexity.

This sounds OK for me.

>
> I'm preparing a patch for that, on top of Anton patch.
>

Great!

Thanks.

2009-11-27 04:22:43

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: PATCH: softlockup: Fix hung_task_check_count sysctl

On Fri, Nov 27, 2009 at 11:02:13AM +0800, Am?rico Wang wrote:
> On Fri, Nov 27, 2009 at 10:55 AM, Frederic Weisbecker
> <[email protected]> wrote:
> > On Fri, Nov 27, 2009 at 10:46:53AM +0800, Am?rico Wang wrote:
> >> On Fri, Nov 27, 2009 at 10:28 AM, Anton Blanchard <[email protected]> wrote:
> >> >
> >> > I'm seeing spikes of up to 0.5ms in khungtaskd on a large machine. To reduce
> >> > this source of jitter I tried setting hung_task_check_count to 0:
> >> >
> >> > # echo 0 > /proc/sys/kernel/hung_task_check_count
> >> >
> >> > which didn't have the intended response. Change to a post increment of
> >> > max_count, so a value of 0 means check 0 tasks.
> >> >
> >> > Signed-off-by: Anton Blanchard <[email protected]>
> >>
> >>
> >> Ack.
> >>
> >> I would also suggest to make 'max_count' as unsigned long,
> >> since sysctl_hung_task_check_count is.
> >>
> >> Thanks.
> >
> >
> > Also, the batch_count thing should be dropped I think.
> > This is a hardcoded, not overridable pause after 1024
> > threads checks to avoid latencies caused by rcu_read_lock.
> > But now we have PREEMPT_RCU so people can enable it if
> > they care about latency. We should remove it as it adds
> > unnecessary complexity.
>
> This sounds OK for me.
>
> >
> > I'm preparing a patch for that, on top of Anton patch.
> >
>
> Great!
>
> Thanks.


Actually it looks like this check is not only there to take
care of latency but also to avoid too much waiting before
a grace period.

Hm, well, I'm not sure what to do.

2009-11-27 05:49:18

by Anton Blanchard

[permalink] [raw]
Subject: [tip:core/softlockup] softlockup: Fix hung_task_check_count sysctl

Commit-ID: e5af02261668350b43eb7381648930bde8e872f7
Gitweb: http://git.kernel.org/tip/e5af02261668350b43eb7381648930bde8e872f7
Author: Anton Blanchard <[email protected]>
AuthorDate: Fri, 27 Nov 2009 13:28:20 +1100
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 27 Nov 2009 06:21:57 +0100

softlockup: Fix hung_task_check_count sysctl

I'm seeing spikes of up to 0.5ms in khungtaskd on a large
machine. To reduce this source of jitter I tried setting
hung_task_check_count to 0:

# echo 0 > /proc/sys/kernel/hung_task_check_count

which didn't have the intended response. Change to a post
increment of max_count, so a value of 0 means check 0 tasks.

Signed-off-by: Anton Blanchard <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: [email protected]
LKML-Reference: <20091127022820.GU32182@kryten>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/hung_task.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index d4e8417..0c642d5 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -144,7 +144,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)

rcu_read_lock();
do_each_thread(g, t) {
- if (!--max_count)
+ if (!max_count--)
goto unlock;
if (!--batch_count) {
batch_count = HUNG_TASK_BATCHING;

2009-11-30 19:13:18

by Mandeep Baines

[permalink] [raw]
Subject: Re: PATCH: softlockup: Fix hung_task_check_count sysctl

To disable hung_task at runtime:

# echo 0 > /proc/sys/kernel/hung_task_check_count

33 /*
34 * Zero means infinite timeout - no checking done:
35 */
36 unsigned long __read_mostly sysctl_hung_task_timeout_secs = 120;

For hung_task_check_count, 0 means there is no maximum so all tasks
are checked.

Anton Blanchard ([email protected]) wrote:
>
> I'm seeing spikes of up to 0.5ms in khungtaskd on a large machine. To reduce
> this source of jitter I tried setting hung_task_check_count to 0:
>

Hmm, maybe khungtaskd should be run using SCHED_IDLE.

> # echo 0 > /proc/sys/kernel/hung_task_check_count
>
> which didn't have the intended response. Change to a post increment of
> max_count, so a value of 0 means check 0 tasks.
>
> Signed-off-by: Anton Blanchard <[email protected]>
> ---
>
> Index: linux.trees.git/kernel/hung_task.c
> ===================================================================
> --- linux.trees.git.orig/kernel/hung_task.c 2009-11-27 13:11:46.000000000 +1100
> +++ linux.trees.git/kernel/hung_task.c 2009-11-27 13:11:57.000000000 +1100
> @@ -144,7 +144,7 @@ static void check_hung_uninterruptible_t
>
> rcu_read_lock();
> do_each_thread(g, t) {
> - if (!--max_count)
> + if (!max_count--)
> goto unlock;
> if (!--batch_count) {
> batch_count = HUNG_TASK_BATCHING;

2009-11-30 20:21:20

by Mandeep Baines

[permalink] [raw]
Subject: Re: PATCH: softlockup: Fix hung_task_check_count sysctl

On Mon, Nov 30, 2009 at 11:13 AM, Mandeep Singh Baines <[email protected]> wrote:
> To disable hung_task at runtime:
>
> # echo 0 > /proc/sys/kernel/hung_task_check_count
>

D'oh. Copy and paste error. I meant

# echo 0 > /proc/sys/kernel/hung_task_timeout_secs

The nice thing about hung_task_timeout_secs=0 is that hungtaskd will
no longer run. Implementing
no checking done with hung_task_check_count=0 would result in the task
getting scheduled,
running, and then doing nothing.

> 33 /*
> 34 ?* Zero means infinite timeout - no checking done:
> 35 ?*/
> 36 unsigned long __read_mostly sysctl_hung_task_timeout_secs = 120;
>
> For hung_task_check_count, 0 means there is no maximum so all tasks
> are checked.
>
> Anton Blanchard ([email protected]) wrote:
>>
>> I'm seeing spikes of up to 0.5ms in khungtaskd on a large machine. To reduce
>> this source of jitter I tried setting hung_task_check_count to 0:
>>
>
> Hmm, maybe khungtaskd should be run using SCHED_IDLE.
>
>> # echo 0 > /proc/sys/kernel/hung_task_check_count
>>
>> which didn't have the intended response. Change to a post increment of
>> max_count, so a value of 0 means check 0 tasks.
>>
>> Signed-off-by: Anton Blanchard <[email protected]>
>> ---
>>
>> Index: linux.trees.git/kernel/hung_task.c
>> ===================================================================
>> --- linux.trees.git.orig/kernel/hung_task.c ? 2009-11-27 13:11:46.000000000 +1100
>> +++ linux.trees.git/kernel/hung_task.c ? ? ? ?2009-11-27 13:11:57.000000000 +1100
>> @@ -144,7 +144,7 @@ static void check_hung_uninterruptible_t
>>
>> ? ? ? rcu_read_lock();
>> ? ? ? do_each_thread(g, t) {
>> - ? ? ? ? ? ? if (!--max_count)
>> + ? ? ? ? ? ? if (!max_count--)
>> ? ? ? ? ? ? ? ? ? ? ? goto unlock;
>> ? ? ? ? ? ? ? if (!--batch_count) {
>> ? ? ? ? ? ? ? ? ? ? ? batch_count = HUNG_TASK_BATCHING;
>