2021-03-09 16:20:31

by Jim Newsome

[permalink] [raw]
Subject: [PATCH v2] do_wait: make PIDTYPE_PID case O(1) instead of O(n)

do_wait is an internal function used to implement waitpid, waitid,
wait4, etc. To handle the general case, it does an O(n) linear scan of
the thread group's children and tracees.

This patch adds a special-case when waiting on a pid to skip these scans
and instead do an O(1) lookup. This improves performance when waiting on
a pid from a thread group with many children and/or tracees.

Signed-off-by: James Newsome
---
kernel/exit.c | 54 ++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 04029e35e69a..312c4dfc9555 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1439,6 +1439,33 @@ void __wake_up_parent(struct task_struct *p, struct task_struct *parent)
TASK_INTERRUPTIBLE, p);
}

+// Optimization for waiting on PIDTYPE_PID. No need to iterate through child
+// and tracee lists to find the target task.
+static int do_wait_pid(struct wait_opts *wo, struct task_struct *tsk)
+{
+ struct task_struct *target = pid_task(wo->wo_pid, PIDTYPE_PID);
+ if (!target) {
+ return 0;
+ }
+ if (tsk == target->real_parent ||
+ (!(wo->wo_flags & __WNOTHREAD) &&
+ same_thread_group(tsk, target->real_parent))) {
+ int retval = wait_consider_task(wo, /* ptrace= */ 0, target);
+ if (retval) {
+ return retval;
+ }
+ }
+ if (target->ptrace && (tsk == target->parent ||
+ (!(wo->wo_flags & __WNOTHREAD) &&
+ same_thread_group(tsk, target->parent)))) {
+ int retval = wait_consider_task(wo, /* ptrace= */ 1, target);
+ if (retval) {
+ return retval;
+ }
+ }
+ return 0;
+}
+
static long do_wait(struct wait_opts *wo)
{
struct task_struct *tsk;
@@ -1464,18 +1491,27 @@ static long do_wait(struct wait_opts *wo)
set_current_state(TASK_INTERRUPTIBLE);
read_lock(&tasklist_lock);
tsk = current;
- do {
- retval = do_wait_thread(wo, tsk);
- if (retval)
- goto end;

- retval = ptrace_do_wait(wo, tsk);
- if (retval)
+ if (wo->wo_type == PIDTYPE_PID) {
+ retval = do_wait_pid(wo, tsk);
+ if (retval) {
goto end;
+ }
+ } else {
+ do {
+ retval = do_wait_thread(wo, tsk);
+ if (retval)
+ goto end;

- if (wo->wo_flags & __WNOTHREAD)
- break;
- } while_each_thread(current, tsk);
+ retval = ptrace_do_wait(wo, tsk);
+ if (retval)
+ goto end;
+
+ if (wo->wo_flags & __WNOTHREAD)
+ break;
+ }
+ while_each_thread(current, tsk);
+ }
read_unlock(&tasklist_lock);

notask:
--
2.30.1


2021-03-09 17:17:56

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v2] do_wait: make PIDTYPE_PID case O(1) instead of O(n)

Jim,

Thanks, the patch looks good to me. Yet I think you need to send V3 even
if I personally do not care ;) Please consider ./scripts/checkpatch.pl,
it reports all the coding-style problems I was going to mention.

I too have a couple of cosmetic nits, but feel free to ignore, this is
subjective.

On 03/09, Jim Newsome wrote:
>
> do_wait is an internal function used to implement waitpid, waitid,
> wait4, etc. To handle the general case, it does an O(n) linear scan of
> the thread group's children and tracees.
>
> This patch adds a special-case when waiting on a pid to skip these scans
> and instead do an O(1) lookup. This improves performance when waiting on
> a pid from a thread group with many children and/or tracees.
>
> Signed-off-by: James Newsome
> ---
> kernel/exit.c | 54 ++++++++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 45 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 04029e35e69a..312c4dfc9555 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1439,6 +1439,33 @@ void __wake_up_parent(struct task_struct *p, struct task_struct *parent)
> TASK_INTERRUPTIBLE, p);
> }
>
> +// Optimization for waiting on PIDTYPE_PID. No need to iterate through child
> +// and tracee lists to find the target task.
> +static int do_wait_pid(struct wait_opts *wo, struct task_struct *tsk)
> +{
> + struct task_struct *target = pid_task(wo->wo_pid, PIDTYPE_PID);
> + if (!target) {
> + return 0;
> + }
> + if (tsk == target->real_parent ||
> + (!(wo->wo_flags & __WNOTHREAD) &&
> + same_thread_group(tsk, target->real_parent))) {
> + int retval = wait_consider_task(wo, /* ptrace= */ 0, target);
> + if (retval) {
> + return retval;
> + }
> + }
> + if (target->ptrace && (tsk == target->parent ||
> + (!(wo->wo_flags & __WNOTHREAD) &&
> + same_thread_group(tsk, target->parent)))) {
> + int retval = wait_consider_task(wo, /* ptrace= */ 1, target);
> + if (retval) {
> + return retval;
> + }
> + }

Both if's use "int retval", to me it would be better to declare this variable
at the start of do_wait_pid(). But again, I won't insist this is up to you.

I am wondering if something like

static inline bool is_parent(struct task_struct *tsk, struct task_struct *p, int flags)
{
return tsk == p || !(flags & __WNOTHREAD)) && same_thread_group(tsk, p);
}

makes any sense to make do_wait_pid() more clear... probably not.

Oleg.

2021-03-09 20:44:20

by Jim Newsome

[permalink] [raw]
Subject: Re: [PATCH v2] do_wait: make PIDTYPE_PID case O(1) instead of O(n)

On 3/9/21 11:15, Oleg Nesterov wrote:
> Jim,
>
> Thanks, the patch looks good to me. Yet I think you need to send V3 even
> if I personally do not care ;) Please consider ./scripts/checkpatch.pl,
> it reports all the coding-style problems I was going to mention.

Thanks! I'd thought clang-format with the included configuration would
be sufficient, but apparently not :)

> Both if's use "int retval", to me it would be better to declare this variable
> at the start of do_wait_pid(). But again, I won't insist this is up to you.
My usual inclination is to avoid uninitialized variables and prefer
putting them in tighter scopes. I don't think it's really much of an
issue in this relatively short function though; happy to go with the
prevailing style.
> I am wondering if something like
>
> static inline bool is_parent(struct task_struct *tsk, struct task_struct *p, int flags)
> {
> return tsk == p || !(flags & __WNOTHREAD)) && same_thread_group(tsk, p);
> }
>
> makes any sense to make do_wait_pid() more clear... probably not.

Yeah, I lean slightly towards the extra level of indirection not being
worth the deduplication.

I made a couple other small changes as well:

* No need for do_wait_pid to take the parameter `tsk` since it's only
ever called with `current`

* With that change, the declaration of `tsk` in `do_wait` can be moved
into a tighter scope of where it's used in the loop.

v3: https://lkml.org/lkml/2021/3/9/1134