2006-03-19 19:29:12

by Oleg Nesterov

Subject: [PATCH] simplify/fix first_tid()

first_tid:

/* If nr exceeds the number of threads there is nothing todo */
if (nr) {
if (nr >= get_nr_threads(leader))
goto done;
}

This is not reliable: sub-threads can exit after this check, so the
'for' loop below can wrap around the thread list and proc_task_readdir()
can return dirents that were already filldir'ed.

for (; pos && pid_alive(pos); pos = next_thread(pos)) {
if (--nr > 0)
continue;

Off-by-one error, will return 'leader' when nr == 1.

This patch tries to fix these problems and simplify the code.

Signed-off-by: Oleg Nesterov <[email protected]>

--- MM/fs/proc/base.c~ 2006-03-19 23:25:38.000000000 +0300
+++ MM/fs/proc/base.c 2006-03-20 00:01:12.000000000 +0300
@@ -2180,38 +2180,29 @@ int proc_pid_readdir(struct file * filp,
static struct task_struct *first_tid(struct task_struct *leader,
int tid, int nr)
{
- struct task_struct *pos = NULL;
+ struct task_struct *pos;

rcu_read_lock();
/* Attempt to start with the pid of a thread */
- if (tid && (nr > 0)) {
- pos = find_task_by_pid(tid);
- if (pos && (pos->group_leader != leader))
- pos = NULL;
- if (pos)
- nr = 0;
- }
-
- /* If nr exceeds the number of threads there is nothing todo */
- if (nr) {
- if (nr >= get_nr_threads(leader))
- goto done;
- }
-
- /* If we haven't found our starting place yet start with the
- * leader and walk nr threads forward.
- */
- if (!pos && (nr >= 0))
- pos = leader;
-
- for (; pos && pid_alive(pos); pos = next_thread(pos)) {
- if (--nr > 0)
- continue;
- get_task_struct(pos);
- goto done;
- }
- pos = NULL;
-done:
+ if (tid && (nr > 0)) {
+ pos = find_task_by_pid(tid);
+ if (pos && (pos->group_leader == leader))
+ goto found;
+ }
+
+ /* If we haven't found our starting place yet start
+ * with the leader and walk nr threads forward.
+ */
+ for (pos = leader; nr > 0; --nr) {
+ pos = next_thread(pos);
+ if (pos == leader) {
+ pos = NULL;
+ goto out;
+ }
+ }
+found:
+ get_task_struct(pos);
+out:
rcu_read_unlock();
return pos;
}
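
The off-by-one and the wrap-around described in the changelog can be reproduced in a small userspace model. This is only a sketch: `toy_task`, `make_ring`, `old_walk` and `new_walk` are hypothetical names, the `next` pointer stands in for next_thread(), and the tid lookup, pid_alive() and reference counting are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: threads form a circular list. */
struct toy_task {
	int id;
	struct toy_task *next;	/* plays the role of next_thread() */
};

/* Link t[0..n-1] into a ring; t[0] acts as the group leader. */
static void make_ring(struct toy_task *t, int n)
{
	int i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		t[i].id = i;
		t[i].next = &t[(i + 1) % n];
	}
}

/* Shape of the old loop: the pre-decrement makes nr == 0 and nr == 1
 * both stop on the leader, and a large nr silently wraps around. */
static struct toy_task *old_walk(struct toy_task *leader, int nr)
{
	struct toy_task *pos;

	for (pos = leader; pos; pos = pos->next) {
		if (--nr > 0)
			continue;
		return pos;
	}
	return NULL;
}

/* Shape of the new loop: step nr times, give up on reaching the leader again. */
static struct toy_task *new_walk(struct toy_task *leader, int nr)
{
	struct toy_task *pos;

	for (pos = leader; nr > 0; --nr) {
		pos = pos->next;
		if (pos == leader)
			return NULL;
	}
	return pos;
}
```

With a ring of three threads, old_walk() returns the leader for both nr == 0 and nr == 1, while new_walk() returns the second thread for nr == 1 and NULL once nr reaches the ring size.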


2006-03-20 18:01:48

by Eric W. Biederman

Subject: Re: [PATCH] simplify/fix first_tid()

Oleg Nesterov <[email protected]> writes:

> first_tid:
>
> /* If nr exceeds the number of threads there is nothing todo */
> if (nr) {
> if (nr >= get_nr_threads(leader))
> goto done;
> }
>
> This is not reliable: sub-threads can exit after this check, so the
> 'for' loop below can wrap around the thread list and proc_task_readdir()
> can return dirents that were already filldir'ed.
>
> for (; pos && pid_alive(pos); pos = next_thread(pos)) {
> if (--nr > 0)
> continue;
>
> Off-by-one error, will return 'leader' when nr == 1.
>
> This patch tries to fix these problems and simplify the code.

This is better. However, if I read this code correctly, it changes
things so that the last time user space goes through this loop, with
nr > nr_threads, we walk the entire thread list to achieve nothing.

So we really still need the nr_threads test in there so we don't
traverse the list twice every time through readdir.

Eric

2006-03-20 18:33:56

by Oleg Nesterov

Subject: Re: [PATCH] simplify/fix first_tid()

"Eric W. Biederman" wrote:
>
> Oleg Nesterov <[email protected]> writes:
>
> > first_tid:
> >
> > /* If nr exceeds the number of threads there is nothing todo */
> > if (nr) {
> > if (nr >= get_nr_threads(leader))
> > goto done;
> > }
> >
> > This is not reliable: sub-threads can exit after this check, so the
> > 'for' loop below can wrap around the thread list and proc_task_readdir()
> > can return dirents that were already filldir'ed.
> >
> > for (; pos && pid_alive(pos); pos = next_thread(pos)) {
> > if (--nr > 0)
> > continue;
> >
> > Off-by-one error, will return 'leader' when nr == 1.
> >
> > This patch tries to fix these problems and simplify the code.
>
> This is better. However, if I read this code correctly, it changes
> things so that the last time user space goes through this loop, with
> nr > nr_threads, we walk the entire thread list to achieve nothing.

This can happen only if the thread we stopped at has exited, and
some other threads have exited too, so that nr >= ->signal->count.

I don't think it's worth optimizing this rare and anyway slow path.
However, you are the code's author, so I'll send a trivial patch that
restores this optimization if you don't change your mind.

> So we really still need the nr_threads test in there so we don't
> traverse the list twice every time through readdir.

How so? We don't do it twice?

Oleg.

2006-03-20 18:57:50

by Eric W. Biederman

Subject: Re: [PATCH] simplify/fix first_tid()

Oleg Nesterov <[email protected]> writes:

> "Eric W. Biederman" wrote:
>> This is better. However, if I read this code correctly, it changes
>> things so that the last time user space goes through this loop, with
>> nr > nr_threads, we walk the entire thread list to achieve nothing.
>
> This can happen only if the thread we stopped at has exited, and
> some other threads have exited too, so that nr >= ->signal->count.
>
> I don't think it's worth optimizing this rare and anyway slow path.
> However, you are the code's author, so I'll send a trivial patch that
> restores this optimization if you don't change your mind.
>
>> So we really still need the nr_threads test in there so we don't
>> traverse the list twice every time through readdir.
>
> How so? We don't do it twice?

In general user space does, because a read of 0 bytes signifies
the end of a directory.

So we have 2 trips through proc_task_readdir() initiated by user
space.

Eric
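
The two trips are visible from user space. The sketch below is Linux-specific and only illustrative: `count_getdents_calls` is a hypothetical helper that drains a directory with the raw getdents64 system call and counts the calls, the last of which is the one that returns 0.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Return how many getdents64() calls it takes to reach end-of-directory,
 * counting the final call that returns 0. Returns -1 on any error. */
static int count_getdents_calls(const char *path)
{
	char buf[4096];
	long n;
	int calls = 0;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY | O_DIRECTORY);
	if (fd < 0)
		return -1;

	do {
		/* n > 0: bytes of dirents returned; n == 0: end of directory */
		n = syscall(SYS_getdents64, fd, buf, sizeof(buf));
		calls++;
	} while (n > 0);

	close(fd);
	return n < 0 ? -1 : calls;
}
```

Even for a directory whose entries all fit in one buffer, the count is at least 2: one call that returns the entries and one that returns 0. For /proc/<pid>/task, that second call is the one that reaches first_tid() with nr at or past the number of threads.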

2006-03-20 19:35:41

by Oleg Nesterov

Subject: Re: [PATCH] simplify/fix first_tid()

"Eric W. Biederman" wrote:
>
> Oleg Nesterov <[email protected]> writes:
>
> >> So we really still need the nr_threads test in there so we don't
> >> traverse the list twice every time through readdir.
> >
> > How so? We don't do it twice?
>
> In general user space does, because a read of 0 bytes signifies
> the end of a directory.
>
> So we have 2 trips through proc_task_readdir() initiated by user
> space.

Oh, thanks, you are right.

[PATCH] simplify-fix-first_tid-fix

Restore a stupidly deleted optimization.

Signed-off-by: Oleg Nesterov <[email protected]>

--- MM/fs/proc/base.c~ 2006-03-21 01:08:10.000000000 +0300
+++ MM/fs/proc/base.c 2006-03-21 01:14:36.000000000 +0300
@@ -2190,6 +2190,11 @@ static struct task_struct *first_tid(str
goto found;
}

+ /* If nr exceeds the number of threads there is nothing todo */
+ pos = NULL;
+ if (nr && nr >= get_nr_threads(leader))
+ goto out;
+
/* If we haven't found our starting place yet start
* with the leader and walk nr threads forward.
*/
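
What the restored check saves can be estimated with a userspace model that counts list hops. This is a sketch under assumptions: `ring_node`, `link_ring` and `walk_counted` are hypothetical names, `nr_threads` stands in for get_nr_threads(leader), and the tid lookup and reference counting are omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread on the circular thread list. */
struct ring_node {
	struct ring_node *next;
};

/* Link r[0..n-1] into a ring; r[0] acts as the group leader. */
static void link_ring(struct ring_node *r, int n)
{
	int i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		r[i].next = &r[(i + 1) % n];
}

/* Model of the walk in first_tid(); *steps counts next-pointer hops.
 * use_check selects whether the early nr >= nr_threads test is applied. */
static struct ring_node *walk_counted(struct ring_node *leader, int nr,
				      int nr_threads, int use_check,
				      int *steps)
{
	struct ring_node *pos;

	*steps = 0;
	if (use_check && nr && nr >= nr_threads)
		return NULL;	/* nothing to do, list never touched */

	for (pos = leader; nr > 0; --nr) {
		pos = pos->next;
		++*steps;
		if (pos == leader)
			return NULL;	/* wrapped around */
	}
	return pos;
}
```

For a four-thread ring and nr == 4, the walk without the check takes four hops before returning NULL, while the walk with the check returns NULL after zero hops. A lookup inside the ring (nr == 2) costs two hops either way.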