2022-09-28 08:22:20

by Kassey Li

Subject: [PATCH v2] kernel/hung_task: add option to ignore task

By default, hung_task iterates over the task list and checks for tasks
that have stayed in TASK_UNINTERRUPTIBLE state longer than a given
timeout value.

Add a flag to task_struct so that a task can be ignored. The flag
defaults to true, so the original behaviour is unchanged.

This is useful when the timeout is set to 5s and only certain tasks of
interest should be detected.

Suggested-by: Naman Jain <[email protected]>
Signed-off-by: Kassey Li <[email protected]>
---
 include/linux/sched.h | 1 +
 kernel/fork.c         | 1 +
 kernel/hung_task.c    | 3 ++-
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e7b2f8a5c711..7c8596fea1f6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1071,6 +1071,7 @@ struct task_struct {
 #ifdef CONFIG_DETECT_HUNG_TASK
 	unsigned long last_switch_count;
 	unsigned long last_switch_time;
+	bool hung_task_detect;
 #endif
 	/* Filesystem information: */
 	struct fs_struct *fs;
diff --git a/kernel/fork.c b/kernel/fork.c
index 90c85b17bf69..5c461a37a26e 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1552,6 +1552,7 @@ static int copy_mm(unsigned long clone_flags, struct task_struct *tsk)
 #ifdef CONFIG_DETECT_HUNG_TASK
 	tsk->last_switch_count = tsk->nvcsw + tsk->nivcsw;
 	tsk->last_switch_time = 0;
+	tsk->hung_task_detect = 1;
 #endif

 	tsk->mm = NULL;
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index bb2354f73ded..74bf4cef857f 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -119,7 +119,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 	if (sysctl_hung_task_panic) {
 		console_verbose();
 		hung_task_show_lock = true;
-		hung_task_call_panic = true;
+		if (t->hung_task_detect)
+			hung_task_call_panic = true;
 	}

 	/*
--
2.17.1
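
For readers following the thread, here is a rough userspace model of
what the hung_task.c hunk changes. It is only a sketch, not kernel code:
struct task, would_panic() and the use of plain seconds instead of
jiffies are assumptions made for illustration. Only the field name
hung_task_detect comes from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the task_struct fields used by the check. */
struct task {
	unsigned long switch_count;      /* models nvcsw + nivcsw right now */
	unsigned long last_switch_count; /* value seen at the previous scan */
	unsigned long last_switch_time;  /* seconds; when the count last changed */
	bool hung_task_detect;           /* flag added by the patch */
};

/*
 * Returns true if a scan at time @now would request a panic for @t.
 * @panic_enabled models sysctl_hung_task_panic.
 */
bool would_panic(struct task *t, unsigned long now,
		 unsigned long timeout, bool panic_enabled)
{
	/* The task was scheduled since the last scan, so it is not hung. */
	if (t->switch_count != t->last_switch_count) {
		t->last_switch_count = t->switch_count;
		t->last_switch_time = now;
		return false;
	}

	/* The task has not been blocked for the full timeout yet. */
	if (now < t->last_switch_time + timeout)
		return false;

	/* Hung: with the patch, the panic additionally depends on the flag. */
	return panic_enabled && t->hung_task_detect;
}
```

With a 5s timeout and a task blocked since time 0, a scan at time 6
returns true when hung_task_detect is set and false when it is clear.
Note that in the hunk as posted the flag only guards
hung_task_call_panic; console_verbose() and hung_task_show_lock are not
affected by it.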


2022-09-28 13:54:31

by Peter Zijlstra

Subject: Re: [PATCH v2] kernel/hung_task: add option to ignore task

On Wed, Sep 28, 2022 at 03:48:41PM +0800, Kassey Li wrote:
> By default, hung_task iterates over the task list and checks for tasks
> that have stayed in TASK_UNINTERRUPTIBLE state longer than a given
> timeout value.
>
> Add a flag to task_struct so that a task can be ignored. The flag
> defaults to true, so the original behaviour is unchanged.
>
> This is useful when the timeout is set to 5s and only certain tasks of
> interest should be detected.

What the hell for?

2022-09-29 03:01:32

by Pavan Kondeti

Subject: Re: [PATCH v2] kernel/hung_task: add option to ignore task

On Wed, Sep 28, 2022 at 03:48:41PM +0800, Kassey Li wrote:
> By default, hung_task iterates over the task list and checks for tasks
> that have stayed in TASK_UNINTERRUPTIBLE state longer than a given
> timeout value.
>
> Add a flag to task_struct so that a task can be ignored. The flag
> defaults to true, so the original behaviour is unchanged.
>
> This is useful when the timeout is set to 5s and only certain tasks of
> interest should be detected.
>
> Suggested-by: Naman Jain <[email protected]>
> Signed-off-by: Kassey Li <[email protected]>
> ---
> include/linux/sched.h | 1 +
> kernel/fork.c | 1 +
> kernel/hung_task.c | 3 ++-
> 3 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index e7b2f8a5c711..7c8596fea1f6 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1071,6 +1071,7 @@ struct task_struct {
> #ifdef CONFIG_DETECT_HUNG_TASK
> unsigned long last_switch_count;
> unsigned long last_switch_time;
> + bool hung_task_detect;
> #endif
> /* Filesystem information: */
> struct fs_struct *fs;
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 90c85b17bf69..5c461a37a26e 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1552,6 +1552,7 @@ static int copy_mm(unsigned long clone_flags, struct task_struct *tsk)
> #ifdef CONFIG_DETECT_HUNG_TASK
> tsk->last_switch_count = tsk->nvcsw + tsk->nivcsw;
> tsk->last_switch_time = 0;
> + tsk->hung_task_detect = 1;
> #endif
>
> tsk->mm = NULL;
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index bb2354f73ded..74bf4cef857f 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -119,7 +119,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> if (sysctl_hung_task_panic) {
> console_verbose();
> hung_task_show_lock = true;
> - hung_task_call_panic = true;
> + if (t->hung_task_detect)
> + hung_task_call_panic = true;
> }
>
> /*
> --
> 2.17.1
>

This patch does not seem to be complete. Do you plan to provide an interface
to set/clear task_struct::hung_task_detect? Please explain the motivation and
the problems solved by this interface.

Thanks,
Pavan