2022-12-27 16:28:24

by Tio Zhang

[permalink] [raw]
Subject: [PATCH] sched: print parent comm in sched_show_task()

Knowing who the parent is might be useful for debugging.
For example, we can sometimes resolve kernel hung tasks by stopping
the person who begins those hung tasks.
With the parent's name printed in sched_show_task(),
it might be helpful to let people know which "service" should be operated.

Signed-off-by: Tio Zhang <[email protected]>
---
kernel/sched/core.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cb2aa2b54c7a..6f4aef0fed58 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8854,6 +8854,7 @@ void sched_show_task(struct task_struct *p)
{
unsigned long free = 0;
int ppid;
+ char pcomm[TASK_COMM_LEN];

if (!try_get_task_stack(p))
return;
@@ -8867,11 +8868,13 @@ void sched_show_task(struct task_struct *p)
#endif
ppid = 0;
rcu_read_lock();
- if (pid_alive(p))
+ if (pid_alive(p)) {
ppid = task_pid_nr(rcu_dereference(p->real_parent));
+ get_task_comm(pcomm, rcu_dereference(p->real_parent));
+ }
rcu_read_unlock();
- pr_cont(" stack:%-5lu pid:%-5d ppid:%-6d flags:0x%08lx\n",
- free, task_pid_nr(p), ppid,
+ pr_cont(" stack:%-5lu pid:%-5d ppid:%-6d parent:%-15.15s flags:0x%08lx\n",
+ free, task_pid_nr(p), ppid, pcomm,
read_task_thread_flags(p));

print_worker_info(KERN_INFO, p);
--
2.17.1


2022-12-29 04:29:23

by Chen Yu

[permalink] [raw]
Subject: Re: [PATCH] sched: print parent comm in sched_show_task()

On 2022-12-28 at 00:14:00 +0800, Tio Zhang wrote:
> Knowing who the parent is might be useful for debugging.
> For example, we can sometimes resolve kernel hung tasks by stopping
> the person who begins those hung tasks.
> With the parent's name printed in sched_show_task(),
> it might be helpful to let people know which "service" should be operated.
>
> Signed-off-by: Tio Zhang <[email protected]>
> ---
> kernel/sched/core.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index cb2aa2b54c7a..6f4aef0fed58 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8854,6 +8854,7 @@ void sched_show_task(struct task_struct *p)
> {
> unsigned long free = 0;
> int ppid;
> + char pcomm[TASK_COMM_LEN];
>
> if (!try_get_task_stack(p))
> return;
> @@ -8867,11 +8868,13 @@ void sched_show_task(struct task_struct *p)
> #endif
> ppid = 0;
> rcu_read_lock();
> - if (pid_alive(p))
> + if (pid_alive(p)) {
> ppid = task_pid_nr(rcu_dereference(p->real_parent));
> + get_task_comm(pcomm, rcu_dereference(p->real_parent));
Maybe struct task_struct *parent = rcu_dereference(p->real_parent);
and use parent directly to get its pid and comm?
Maybe off-topic, what if the parent is a kernel thread/worker? It might have extra
name information such as kthread->full_name or worker->desc according to proc_task_name().

thanks,

2023-01-04 13:33:53

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] sched: print parent comm in sched_show_task()

On Wed 2022-12-28 00:14:00, Tio Zhang wrote:
> Knowing who the parent is might be useful for debugging.
> For example, we can sometimes resolve kernel hung tasks by stopping
> the person who begins those hung tasks.
> With the parent's name printed in sched_show_task(),
> it might be helpful to let people know which "service" should be operated.

> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8854,6 +8854,7 @@ void sched_show_task(struct task_struct *p)
> {
> unsigned long free = 0;
> int ppid;
> + char pcomm[TASK_COMM_LEN];
>
> if (!try_get_task_stack(p))
> return;
> @@ -8867,11 +8868,13 @@ void sched_show_task(struct task_struct *p)
> #endif
> ppid = 0;

We need to intialized pcomm here:

pcomm[0] = '\0';

Otherwise, it would include a garbage when pid_alive(p) returns false below..

> rcu_read_lock();
> - if (pid_alive(p))
> + if (pid_alive(p)) {
> ppid = task_pid_nr(rcu_dereference(p->real_parent));
> + get_task_comm(pcomm, rcu_dereference(p->real_parent));
> + }
> rcu_read_unlock();
> - pr_cont(" stack:%-5lu pid:%-5d ppid:%-6d flags:0x%08lx\n",
> - free, task_pid_nr(p), ppid,
> + pr_cont(" stack:%-5lu pid:%-5d ppid:%-6d parent:%-15.15s
> flags:0x%08lx\n",

It would print: .... parent:xxx flags:0000

Some people might be confused whether the flags are from
the task or from the parent.

A solution would be to move the parent value to another line.
It would even better solve the situation when the task
is not alive and we could not get information about the parent:

if (pid_alive(p)) {
struct parent = rcu_dereference(p->real_parent);

pr_info("parent:%-15.15s ppid:%-6d\n",
parent->comm, task_pid_nr(parent));
}

> + free, task_pid_nr(p), ppid, pcomm,
> read_task_thread_flags(p));
>
> print_worker_info(KERN_INFO, p);

Best Regards,
Petr

2023-01-04 14:27:01

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] sched: print parent comm in sched_show_task()

On Wed 2023-01-04 01:51:30, 张元瀚 wrote:
> Hi Chen,
> Thanks for your advice!
>
> > Maybe struct task_struct *parent = rcu_dereference(p->real_parent);
> > and use parent directly to get its pid and comm?
>
> Yes! It is good to write this way.
>
> > Maybe off-topic, what if the parent is a kernel thread/worker? It might
> have extra
> > name information such as kthread->full_name or worker->desc according to
> proc_task_name().
>
> I'm not quite sure if it is necessary to fetch that extra information since
> our sched_show_task() prints p->comm ourselves.
> But, assuming we get the parent's name in the same way we get
> proc_task_name(), there are some new issues I'd like to discuss.

I suggest to keep it simple and just print "parent->comm".

Especially, we should avoid taking any lock. shed_show_task() might
be called when there already is a deadlock on the system. I guess
that it even can be called from NMI.

For example, see print_worker_info(). It uses
copy_from_kernel_nofault() to be safe without taking any lock.

Best Regards,
Petr

2023-01-06 10:25:29

by Chen Yu

[permalink] [raw]
Subject: Re: [PATCH] sched: print parent comm in sched_show_task()

Hi Tio,
On 2023-01-04 at 01:51:30 +0800, 张元瀚 wrote:
> Hi Chen,
> Thanks for your advice!
>
> > Maybe struct task_struct *parent = rcu_dereference(p->real_parent);
> > and use parent directly to get its pid and comm?
>
> Yes! It is good to write this way.
>
> > Maybe off-topic, what if the parent is a kernel thread/worker? It might
> have extra
> > name information such as kthread->full_name or worker->desc according to
> proc_task_name().
>
> I'm not quite sure if it is necessary to fetch that extra information since
> our sched_show_task() prints p->comm ourselves.
> But, assuming we get the parent's name in the same way we get
> proc_task_name(), there are some new issues I'd like to discuss.
> For example, if we write like this:
Petr suggested to keep it simple in another thread, so I think we do not
need to let extra information involve in for now.

thanks,
Chenyu

2023-01-12 09:11:37

by Chen Yu

[permalink] [raw]
Subject: Re: [PATCH] sched: print parent comm in sched_show_task()

Hi,
On 2023-01-10 at 17:35:34 +0800, Yuanhan Zhang wrote:
> Hi Petr and Chen,
> Thanks for your emails! Petr's suggestions sure make a lot of sense.
> Please review the following implementation, it would print a new line like:
> " parent:kthreadd ppid:2 "
> after the origin line.
>
> ---
>
> kernel/sched/core.c | 15 ++++++++-------
>
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>
> index cb2aa2b54c7a..86ee90404ddb 100644
>
> --- a/kernel/sched/core.c
>
> +++ b/kernel/sched/core.c
>
> @@ -8853,7 +8853,7 @@ SYSCALL_DEFINE2(sched_rr_get_interval_time32, pid_t,
> pid,
>
> void sched_show_task(struct task_struct *p)
>
> {
>
> unsigned long free = 0;
>
> - int ppid;
>
> + struct task_struct *parent;
>
Above line could be moved inside: if (pid_alive(p)).
And the format of this patch seems to be broken, please resend after fixing the issues
reported by scripts/checkpatch.pl

thanks,
Chenyu