2022-09-23 09:33:40

by Kassey Li

Subject: [PATCH] kernel/hung_task: add option to ignore task

By default, hung_task iterates over the task list and reports any task
that stays in TASK_UNINTERRUPTIBLE longer than a given timeout value.

Some tasks may be in this state as expected, yet are still reported by
hung_task. An example task and trace:

[khungtaskd]Task SettingsProvide:2954 blocked for 90s is causing
panic

[<ffffffd6602f7470>] __switch_to+0x334
[<ffffffd661c2f5b8>] __schedule+0x5e8
[<ffffffd661c2f9f0>] schedule+0x9c
[<ffffffd661c33da8>] schedule_hrtimeout_range_clock+0xd0
[<ffffffd6605ef390>] do_epoll_wait+0x3c0
[<ffffffd6605edd64>] __arm64_sys_epoll_pwait+0x48
[<ffffffd660307494>] el0_svc_common+0xb4
[<ffffffd6603073c4>] el0_svc_handler+0x6c
[<ffffffd660084988>] el0_svc+0x8

Add a per-task flag to task_struct so that such a task can be ignored.
The flag defaults to true, so it does not break the original design.

Suggested-by: Naman Jain <[email protected]>
Signed-off-by: Kassey Li <[email protected]>
---
include/linux/sched.h | 1 +
kernel/fork.c | 1 +
kernel/hung_task.c | 3 ++-
3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e7b2f8a5c711..7c8596fea1f6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1071,6 +1071,7 @@ struct task_struct {
 #ifdef CONFIG_DETECT_HUNG_TASK
 	unsigned long last_switch_count;
 	unsigned long last_switch_time;
+	bool hung_task_detect;
 #endif
 	/* Filesystem information: */
 	struct fs_struct *fs;
diff --git a/kernel/fork.c b/kernel/fork.c
index 90c85b17bf69..5c461a37a26e 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1552,6 +1552,7 @@ static int copy_mm(unsigned long clone_flags, struct task_struct *tsk)
 #ifdef CONFIG_DETECT_HUNG_TASK
 	tsk->last_switch_count = tsk->nvcsw + tsk->nivcsw;
 	tsk->last_switch_time = 0;
+	tsk->hung_task_detect = 1;
 #endif

 	tsk->mm = NULL;
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index bb2354f73ded..74bf4cef857f 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -119,7 +119,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 	if (sysctl_hung_task_panic) {
 		console_verbose();
 		hung_task_show_lock = true;
-		hung_task_call_panic = true;
+		if (t->hung_task_detect)
+			hung_task_call_panic = true;
 	}

 	/*
--
2.17.1
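
For readers following the diff, the changed branch can be modelled in
user space. This is only a sketch: struct task_model and
would_call_panic() are hypothetical names that do not exist in the
kernel, and the model covers only the panic decision. As posted, the
diff guards just the assignment to hung_task_call_panic and does not
add an interface for clearing the flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the task_struct fields relevant to the patch. */
struct task_model {
	const char *comm;        /* task name, as shown in the hung task report */
	bool hung_task_detect;   /* field added by the patch; set to 1 at fork */
};

/*
 * Models the patched branch in check_hung_task(): a hung task requests a
 * panic only when the sysctl is enabled AND the task has not opted out.
 */
static bool would_call_panic(const struct task_model *t,
			     bool sysctl_hung_task_panic)
{
	bool call_panic = false;

	if (sysctl_hung_task_panic) {
		if (t->hung_task_detect)
			call_panic = true;
	}
	return call_panic;
}
```

With the default value of the flag, the result matches the behaviour
before the patch; only a task whose flag has been cleared avoids the
panic.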


2022-09-23 11:23:18

by Peter Zijlstra

Subject: Re: [PATCH] kernel/hung_task: add option to ignore task

On Fri, Sep 23, 2022 at 04:53:35PM +0800, Kassey Li wrote:
> By default, hung_task will iterate the tasklist and check
> state in TASK_UNINTERRUPTIBLE with a given timeout value.
>
> Some tasks may in this state as expected but reported by hung_task.

Please explain..

2022-09-27 02:16:56

by Kassey Li

Subject: Re: [PATCH] kernel/hung_task: add option to ignore task



On 9/23/2022 7:04 PM, Peter Zijlstra wrote:
> On Fri, Sep 23, 2022 at 04:53:35PM +0800, Kassey Li wrote:
>> By default, hung_task will iterate the tasklist and check
>> state in TASK_UNINTERRUPTIBLE with a given timeout value.
>>
>> Some tasks may in this state as expected but reported by hung_task.
>
> Please explain..
I want to set the timeout value to 60s, 20s, 10s, or even 5s so that
hangs in my VIP tasks ("init", "surfaceflinger", "system_server", for
example) are detected more aggressively while debugging.

Many other tasks wait on IO, a mutex, a delayed timer, and so on. They
will hit such a short timeout too, and we want to ignore them.
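
The intended policy could look roughly like the following user-space
sketch. Everything here is an assumption made for illustration: the
patch does not say how the flag would be cleared, and vip_comms,
is_vip_comm(), struct watched_task and apply_vip_policy() are
hypothetical names. Only the three task names come from the message
above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Task names from the message above; the list itself is illustrative. */
static const char *const vip_comms[] = {
	"init", "surfaceflinger", "system_server",
};

/* Hypothetical helper: true if comm names a task that should stay watched. */
static bool is_vip_comm(const char *comm)
{
	for (size_t i = 0; i < sizeof(vip_comms) / sizeof(vip_comms[0]); i++) {
		if (strcmp(comm, vip_comms[i]) == 0)
			return true;
	}
	return false;
}

/* Hypothetical stand-in for a task carrying the flag from the patch. */
struct watched_task {
	const char *comm;
	bool hung_task_detect;
};

/* Keep detection on for VIP tasks and turn it off for everything else. */
static void apply_vip_policy(struct watched_task *t)
{
	t->hung_task_detect = is_vip_comm(t->comm);
}
```

Under this policy a short timeout would only matter for the listed
tasks, while tasks that legitimately sleep for a long time are skipped.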