2006-03-07 20:22:49

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH rc5-mm] pids: kill PIDTYPE_TGID

depends on

pidhash-dont-count-idle-threads.patch
pidhash-kill-switch_exec_pids.patch

otherwise (I think) it is orthogonal to all tref/proc changes.

This patch kills PIDTYPE_TGID pid_type thus saving one hash table
in kernel/pid.c and speeding up subthreads create/destroy a bit.
It is also a preparation for the further tref/pids rework.

This patch adds 'struct list_head tgrp' to 'struct task_struct'
instead. Note that ->tgrp need not to be rcu safe.

We don't detach group leader from PIDTYPE_PID namespace until another
thread inherits it's ->pid == ->tgid, so we are safe wrt premature
free_pidmap(->tgid) call.

Currently there are no users of find_task_by_pid_type(PIDTYPE_TGID).
Should the need arise, we can use find_task_by_pid()->group_leader.

include/linux/pid.h | 1 -
include/linux/sched.h | 10 +++++++---
kernel/exit.c | 10 +---------
kernel/fork.c | 4 +++-
4 files changed, 11 insertions(+), 14 deletions(-)

Signed-off-by: Oleg Nesterov <[email protected]>

--- 2.6.16-rc5/include/linux/pid.h~1_TGID 2006-03-07 18:06:49.000000000 +0300
+++ 2.6.16-rc5/include/linux/pid.h 2006-03-07 18:56:06.000000000 +0300
@@ -4,7 +4,6 @@
enum pid_type
{
PIDTYPE_PID,
- PIDTYPE_TGID,
PIDTYPE_PGID,
PIDTYPE_SID,
PIDTYPE_MAX
--- 2.6.16-rc5/include/linux/sched.h~1_TGID 2006-03-05 23:51:40.000000000 +0300
+++ 2.6.16-rc5/include/linux/sched.h 2006-03-07 18:52:18.000000000 +0300
@@ -748,6 +748,7 @@ struct task_struct {

/* PID/PID hash table linkage. */
struct pid pids[PIDTYPE_MAX];
+ struct list_head tgrp;

struct completion *vfork_done; /* for vfork() */
int __user *set_child_tid; /* CLONE_CHILD_SETTID */
@@ -1181,13 +1182,16 @@ extern void wait_task_inactive(task_t *
#define while_each_thread(g, t) \
while ((t = next_thread(t)) != g)

-extern task_t * FASTCALL(next_thread(const task_t *p));
-
#define thread_group_leader(p) (p->pid == p->tgid)

+static inline task_t *next_thread(task_t *p)
+{
+ return list_entry(p->tgrp.next, task_t, tgrp);
+}
+
static inline int thread_group_empty(task_t *p)
{
- return list_empty(&p->pids[PIDTYPE_TGID].pid_list);
+ return list_empty(&p->tgrp);
}

#define delay_group_leader(p) \
--- 2.6.16-rc5/kernel/exit.c~1_TGID 2006-03-01 22:00:30.000000000 +0300
+++ 2.6.16-rc5/kernel/exit.c 2006-03-07 19:09:20.000000000 +0300
@@ -49,7 +49,6 @@ static void __unhash_process(struct task
{
nr_threads--;
detach_pid(p, PIDTYPE_PID);
- detach_pid(p, PIDTYPE_TGID);
if (thread_group_leader(p)) {
detach_pid(p, PIDTYPE_PGID);
detach_pid(p, PIDTYPE_SID);
@@ -57,7 +56,7 @@ static void __unhash_process(struct task
list_del_init(&p->tasks);
__get_cpu_var(process_counts)--;
}
-
+ list_del(&p->tgrp);
remove_parent(p);
}

@@ -953,13 +952,6 @@ asmlinkage long sys_exit(int error_code)
do_exit((error_code&0xff)<<8);
}

-task_t fastcall *next_thread(const task_t *p)
-{
- return pid_task(p->pids[PIDTYPE_TGID].pid_list.next, PIDTYPE_TGID);
-}
-
-EXPORT_SYMBOL(next_thread);
-
/*
* Take down every thread in the group. This is called by fatal signals
* as well as by sys_exit_group (below).
--- 2.6.16-rc5/kernel/fork.c~1_TGID 2006-03-01 22:00:30.000000000 +0300
+++ 2.6.16-rc5/kernel/fork.c 2006-03-07 19:05:54.000000000 +0300
@@ -1100,6 +1100,7 @@ static task_t *copy_process(unsigned lon
* We dont wake it up yet.
*/
p->group_leader = p;
+ INIT_LIST_HEAD(&p->tgrp);
INIT_LIST_HEAD(&p->ptrace_children);
INIT_LIST_HEAD(&p->ptrace_list);

@@ -1153,7 +1154,9 @@ static task_t *copy_process(unsigned lon
retval = -EAGAIN;
goto bad_fork_cleanup_namespace;
}
+
p->group_leader = current->group_leader;
+ list_add_tail(&p->tgrp, &current->tgrp);

if (current->signal->group_stop_count > 0) {
/*
@@ -1201,7 +1204,6 @@ static task_t *copy_process(unsigned lon
list_add_tail(&p->tasks, &init_task.tasks);
__get_cpu_var(process_counts)++;
}
- attach_pid(p, PIDTYPE_TGID, p->tgid);
attach_pid(p, PIDTYPE_PID, p->pid);
nr_threads++;
}


2006-03-07 22:43:49

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH rc5-mm] pids: kill PIDTYPE_TGID

Oleg Nesterov <[email protected]> writes:

> depends on
>
> pidhash-dont-count-idle-threads.patch
> pidhash-kill-switch_exec_pids.patch
>
> otherwise (I think) it is orthogonal to all tref/proc changes.

You also depend on your recent change to call __unhash_process
under the sighand lock. Since the ->tgrp is protected by
the sighand lock.

> This patch kills PIDTYPE_TGID pid_type thus saving one hash table
> in kernel/pid.c and speeding up subthreads create/destroy a bit.
> It is also a preparation for the further tref/pids rework.
>
> This patch adds 'struct list_head tgrp' to 'struct task_struct'
> instead. Note that ->tgrp need not to be rcu safe.

Is there a reason for this? I think at least proc could easily
take advantage of an rcu safe implementation.

Hmm. At the moment proc is only taking the tasklist_lock during
traversal so we may have a problem with the list_add anyway.

Also could we name the member not ->tgrp but ->threads?
I keep half expecting tgrp to be a number, and have a hard
time not mispelling it tpgrp.

> We don't detach group leader from PIDTYPE_PID namespace until another
> thread inherits it's ->pid == ->tgid, so we are safe wrt premature
> free_pidmap(->tgid) call.

Agreed.

> Currently there are no users of find_task_by_pid_type(PIDTYPE_TGID).
> Should the need arise, we can use find_task_by_pid()->group_leader.
>
> include/linux/pid.h | 1 -
> include/linux/sched.h | 10 +++++++---
> kernel/exit.c | 10 +---------
> kernel/fork.c | 4 +++-
> 4 files changed, 11 insertions(+), 14 deletions(-)

Eric

2006-03-08 14:44:23

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH rc5-mm] pids: kill PIDTYPE_TGID

Eric W. Biederman wrote:
>
> Oleg Nesterov <[email protected]> writes:
>
> > depends on
> >
> > pidhash-dont-count-idle-threads.patch
> > pidhash-kill-switch_exec_pids.patch
> >
> > otherwise (I think) it is orthogonal to all tref/proc changes.
>
> You also depend on your recent change to call __unhash_process
> under the sighand lock. Since the ->tgrp is protected by
> the sighand lock.

Actually now I think it depends only on yours
pidhash-kill-switch_exec_pids.patch

This change (call __unhash_process under the sighand lock) does not
touch __unhash_process() itself, it only moves the callsite. So this
patch can go before or after this change. However it needs other
patches from -mm to avoid rejects.

> > This patch kills PIDTYPE_TGID pid_type thus saving one hash table
> > in kernel/pid.c and speeding up subthreads create/destroy a bit.
> > It is also a preparation for the further tref/pids rework.
> >
> > This patch adds 'struct list_head tgrp' to 'struct task_struct'
> > instead. Note that ->tgrp need not to be rcu safe.
>
> Is there a reason for this? I think at least proc could easily
> take advantage of an rcu safe implementation.
>
> Hmm. At the moment proc is only taking the tasklist_lock during
> traversal so we may have a problem with the list_add anyway.

tasklist or ->sighand is enough to do threads traversal. Yes,
adding _rcu suffix allows us to do it under rcu_read_lock().
However the thread group will not be stable during this traversal,
we can miss some newly created thread or hit an already unhashed
one.

Note also, that this loop:

t = task;
do {
...
t = next_thread(t);
} while (t != task);

requires that task->tgrp is stable (I mean, 'task' must not be
unhashed during this loop). And I can't see how this is possible
unless we take ->sighand or tasklist_lock or task == current.

Eric, I know nothing about proc/, so the question is: do you want
me to add _rcu right now, or do you think it's better to do this
later?

grepping for next_thread in proc/ I have the feeling that we can
convert the code from:

read_lock(tasklist);
if (pid_alive(task)) {
do ... while (t != task);
}
read_unlock(tasklist);

to:

rcu_read_lock();
if (lock_task_sighand(task)) {
...
unlock_task_sighand(task);
}
rcu_read_unlock();

But again, I don't understand this code.

> Also could we name the member not ->tgrp but ->threads?
> I keep half expecting tgrp to be a number, and have a hard
> time not mispelling it tpgrp.

Agreed. I also started with 'threads' name, but we already have
'struct thread_struct thread' in the task_struct. So may be
we could name it thread_group? I am rather agnostic and have nothing
against 'threads', will resend this patch after yours decision.

Oleg.

2006-03-08 17:14:10

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH rc5-mm] pids: kill PIDTYPE_TGID

Oleg Nesterov <[email protected]> writes:

> Eric W. Biederman wrote:
>>
>> Oleg Nesterov <[email protected]> writes:
>>
>> > depends on
>> >
>> > pidhash-dont-count-idle-threads.patch
>> > pidhash-kill-switch_exec_pids.patch
>> >
>> > otherwise (I think) it is orthogonal to all tref/proc changes.
>>
>> You also depend on your recent change to call __unhash_process
>> under the sighand lock. Since the ->tgrp is protected by
>> the sighand lock.
>
> Actually now I think it depends only on yours
> pidhash-kill-switch_exec_pids.patch
>
> This change (call __unhash_process under the sighand lock) does not
> touch __unhash_process() itself, it only moves the callsite. So this
> patch can go before or after this change. However it needs other
> patches from -mm to avoid rejects.

Agreed. I misread the code and didn't see that we had the tasklist_lock where
you did the list_add.

>> > This patch kills PIDTYPE_TGID pid_type thus saving one hash table
>> > in kernel/pid.c and speeding up subthreads create/destroy a bit.
>> > It is also a preparation for the further tref/pids rework.
>> >
>> > This patch adds 'struct list_head tgrp' to 'struct task_struct'
>> > instead. Note that ->tgrp need not to be rcu safe.
>>
>> Is there a reason for this? I think at least proc could easily
>> take advantage of an rcu safe implementation.
>>
>> Hmm. At the moment proc is only taking the tasklist_lock during
>> traversal so we may have a problem with the list_add anyway.
>
> tasklist or ->sighand is enough to do threads traversal. Yes,
> adding _rcu suffix allows us to do it under rcu_read_lock().
> However the thread group will not be stable during this traversal,
> we can miss some newly created thread or hit an already unhashed
> one.

Right so if what we are doing is needs to be atomic we can't drop
the lock. Which makes sense.

> Note also, that this loop:
>
> t = task;
> do {
> ...
> t = next_thread(t);
> } while (t != task);
>
> requires that task->tgrp is stable (I mean, 'task' must not be
> unhashed during this loop). And I can't see how this is possible
> unless we take ->sighand or tasklist_lock or task == current.
>
> Eric, I know nothing about proc/, so the question is: do you want
> me to add _rcu right now, or do you think it's better to do this
> later?

Let's add the manipulations right now, that way we have the option
to convert the users that can take advantage of it.

> grepping for next_thread in proc/ I have the feeling that we can
> convert the code from:
>
> read_lock(tasklist);
> if (pid_alive(task)) {
> do ... while (t != task);
> }
> read_unlock(tasklist);
>
> to:
>
> rcu_read_lock();
> if (lock_task_sighand(task)) {
> ...
> unlock_task_sighand(task);
> }
> rcu_read_unlock();
>
> But again, I don't understand this code.

For proc we don't even need the lock_task_sighand. It is purely best
effort. The more we can remove lock contention from proc the harder it
will be for a user space application to trigger long lock hold
times.

>> Also could we name the member not ->tgrp but ->threads?
>> I keep half expecting tgrp to be a number, and have a hard
>> time not mispelling it tpgrp.
>
> Agreed. I also started with 'threads' name, but we already have
> 'struct thread_struct thread' in the task_struct. So may be
> we could name it thread_group? I am rather agnostic and have nothing
> against 'threads', will resend this patch after yours decision.

Ok. Let's go with thread_group. It isn't a frequently used field
so as long as the name doesn't get terribly long things will work,
and thread_group is still much shorter than it's current name :)

Eric

2006-03-08 19:27:08

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH rc5-mm] pids: kill PIDTYPE_TGID

Changes:
s/tgrp/thread_group/

make it rcu-safe

[PATCH rc5-mm] pids: kill PIDTYPE_TGID

depends on pidhash-dont-count-idle-threads.patch

This patch kills PIDTYPE_TGID pid_type thus saving one hash table
in kernel/pid.c and speeding up subthreads create/destroy a bit.
It is also a preparation for the further tref/pids rework.

This patch adds 'struct list_head thread_group' to 'struct task_struct'
instead.

We don't detach group leader from PIDTYPE_PID namespace until another
thread inherits it's ->pid == ->tgid, so we are safe wrt premature
free_pidmap(->tgid) call.

Currently there are no users of find_task_by_pid_type(PIDTYPE_TGID).
Should the need arise, we can use find_task_by_pid()->group_leader.

include/linux/pid.h | 1 -
include/linux/sched.h | 10 +++++++---
kernel/exit.c | 10 +---------
kernel/fork.c | 4 +++-
4 files changed, 11 insertions(+), 14 deletions(-)

Signed-off-by: Oleg Nesterov <[email protected]>

--- 2.6.16-rc5/include/linux/pid.h~1_TGID 2006-03-09 00:17:24.000000000 +0300
+++ 2.6.16-rc5/include/linux/pid.h 2006-03-09 00:20:45.000000000 +0300
@@ -4,7 +4,6 @@
enum pid_type
{
PIDTYPE_PID,
- PIDTYPE_TGID,
PIDTYPE_PGID,
PIDTYPE_SID,
PIDTYPE_MAX
--- 2.6.16-rc5/include/linux/sched.h~1_TGID 2006-03-09 00:17:24.000000000 +0300
+++ 2.6.16-rc5/include/linux/sched.h 2006-03-09 00:34:57.000000000 +0300
@@ -748,6 +748,7 @@ struct task_struct {

/* PID/PID hash table linkage. */
struct pid pids[PIDTYPE_MAX];
+ struct list_head thread_group;

struct completion *vfork_done; /* for vfork() */
int __user *set_child_tid; /* CLONE_CHILD_SETTID */
@@ -1181,13 +1182,17 @@ extern void wait_task_inactive(task_t *
#define while_each_thread(g, t) \
while ((t = next_thread(t)) != g)

-extern task_t * FASTCALL(next_thread(const task_t *p));
-
#define thread_group_leader(p) (p->pid == p->tgid)

+static inline task_t *next_thread(task_t *p)
+{
+ return list_entry(rcu_dereference(p->thread_group.next),
+ task_t, thread_group);
+}
+
static inline int thread_group_empty(task_t *p)
{
- return list_empty(&p->pids[PIDTYPE_TGID].pid_list);
+ return list_empty(&p->thread_group);
}

#define delay_group_leader(p) \
--- 2.6.16-rc5/kernel/exit.c~1_TGID 2006-03-09 00:17:24.000000000 +0300
+++ 2.6.16-rc5/kernel/exit.c 2006-03-09 00:42:26.000000000 +0300
@@ -49,7 +49,6 @@ static void __unhash_process(struct task
{
nr_threads--;
detach_pid(p, PIDTYPE_PID);
- detach_pid(p, PIDTYPE_TGID);
if (thread_group_leader(p)) {
detach_pid(p, PIDTYPE_PGID);
detach_pid(p, PIDTYPE_SID);
@@ -57,7 +56,7 @@ static void __unhash_process(struct task
list_del_init(&p->tasks);
__get_cpu_var(process_counts)--;
}
-
+ list_del_rcu(&p->thread_group);
remove_parent(p);
}

@@ -953,13 +952,6 @@ asmlinkage long sys_exit(int error_code)
do_exit((error_code&0xff)<<8);
}

-task_t fastcall *next_thread(const task_t *p)
-{
- return pid_task(p->pids[PIDTYPE_TGID].pid_list.next, PIDTYPE_TGID);
-}
-
-EXPORT_SYMBOL(next_thread);
-
/*
* Take down every thread in the group. This is called by fatal signals
* as well as by sys_exit_group (below).
--- 2.6.16-rc5/kernel/fork.c~1_TGID 2006-03-09 00:17:24.000000000 +0300
+++ 2.6.16-rc5/kernel/fork.c 2006-03-09 00:41:35.000000000 +0300
@@ -1100,6 +1100,7 @@ static task_t *copy_process(unsigned lon
* We dont wake it up yet.
*/
p->group_leader = p;
+ INIT_LIST_HEAD(&p->thread_group);
INIT_LIST_HEAD(&p->ptrace_children);
INIT_LIST_HEAD(&p->ptrace_list);

@@ -1153,7 +1154,9 @@ static task_t *copy_process(unsigned lon
retval = -EAGAIN;
goto bad_fork_cleanup_namespace;
}
+
p->group_leader = current->group_leader;
+ list_add_tail_rcu(&p->thread_group, &current->thread_group);

if (current->signal->group_stop_count > 0) {
/*
@@ -1201,7 +1204,6 @@ static task_t *copy_process(unsigned lon
list_add_tail(&p->tasks, &init_task.tasks);
__get_cpu_var(process_counts)++;
}
- attach_pid(p, PIDTYPE_TGID, p->tgid);
attach_pid(p, PIDTYPE_PID, p->pid);
nr_threads++;
}

2006-03-08 20:01:56

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH rc5-mm] pids: kill PIDTYPE_TGID

Oleg Nesterov <[email protected]> writes:

> Changes:
> s/tgrp/thread_group/
>
> make it rcu-safe
>
> [PATCH rc5-mm] pids: kill PIDTYPE_TGID
>
> depends on pidhash-dont-count-idle-threads.patch
>
> This patch kills PIDTYPE_TGID pid_type thus saving one hash table
> in kernel/pid.c and speeding up subthreads create/destroy a bit.
> It is also a preparation for the further tref/pids rework.
>
> This patch adds 'struct list_head thread_group' to 'struct task_struct'
> instead.
>
> We don't detach group leader from PIDTYPE_PID namespace until another
> thread inherits it's ->pid == ->tgid, so we are safe wrt premature
> free_pidmap(->tgid) call.
>
> Currently there are no users of find_task_by_pid_type(PIDTYPE_TGID).
> Should the need arise, we can use find_task_by_pid()->group_leader.

Looks good. I have one final nit.

> --- 2.6.16-rc5/kernel/fork.c~1_TGID 2006-03-09 00:17:24.000000000 +0300
> +++ 2.6.16-rc5/kernel/fork.c 2006-03-09 00:41:35.000000000 +0300
> @@ -1100,6 +1100,7 @@ static task_t *copy_process(unsigned lon
> * We dont wake it up yet.
> */
> p->group_leader = p;
> + INIT_LIST_HEAD(&p->thread_group);
> INIT_LIST_HEAD(&p->ptrace_children);
> INIT_LIST_HEAD(&p->ptrace_list);
>
> @@ -1153,7 +1154,9 @@ static task_t *copy_process(unsigned lon
> retval = -EAGAIN;
> goto bad_fork_cleanup_namespace;
> }
> +
> p->group_leader = current->group_leader;
> + list_add_tail_rcu(&p->thread_group, &current->thread_group);
Can this be:
list_add_tail_rcu(&p->thread_group, &current->group_leader->thread_group);

That way at least the odds of missing a new task_struct when doing an
rcu traversal are reduced almost to 0.

>
> if (current->signal->group_stop_count > 0) {
> /*
> @@ -1201,7 +1204,6 @@ static task_t *copy_process(unsigned lon
> list_add_tail(&p->tasks, &init_task.tasks);
> __get_cpu_var(process_counts)++;
> }
> - attach_pid(p, PIDTYPE_TGID, p->tgid);
> attach_pid(p, PIDTYPE_PID, p->pid);
> nr_threads++;
> }


Eric

2006-03-08 21:49:27

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH rc5-mm] pids: kill PIDTYPE_TGID

Eric W. Biederman wrote:
>
> > p->group_leader = current->group_leader;
> > + list_add_tail_rcu(&p->thread_group, &current->thread_group);
> Can this be:
> list_add_tail_rcu(&p->thread_group, &current->group_leader->thread_group);

Done.

> That way at least the odds of missing a new task_struct when doing an
> rcu traversal are reduced almost to 0.

Am I understand correctly? This change has effect when we are doing the
traversal starting from ->group_leader in a "best effort" manner lockless,
yes?

[PATCH rc5-mm] pids: kill PIDTYPE_TGID

depends on pidhash-dont-count-idle-threads.patch

This patch kills PIDTYPE_TGID pid_type thus saving one hash table
in kernel/pid.c and speeding up subthreads create/destroy a bit.
It is also a preparation for the further tref/pids rework.

This patch adds 'struct list_head thread_group' to 'struct task_struct'
instead.

We don't detach group leader from PIDTYPE_PID namespace until another
thread inherits it's ->pid == ->tgid, so we are safe wrt premature
free_pidmap(->tgid) call.

Currently there are no users of find_task_by_pid_type(PIDTYPE_TGID).
Should the need arise, we can use find_task_by_pid()->group_leader.

include/linux/pid.h | 1 -
include/linux/sched.h | 11 ++++++++---
kernel/exit.c | 10 +---------
kernel/fork.c | 4 +++-
4 files changed, 12 insertions(+), 14 deletions(-)

Signed-off-by: Oleg Nesterov <[email protected]>

--- 2.6.16-rc5/include/linux/pid.h~1_TGID 2006-03-09 00:17:24.000000000 +0300
+++ 2.6.16-rc5/include/linux/pid.h 2006-03-09 03:03:44.000000000 +0300
@@ -4,7 +4,6 @@
enum pid_type
{
PIDTYPE_PID,
- PIDTYPE_TGID,
PIDTYPE_PGID,
PIDTYPE_SID,
PIDTYPE_MAX
--- 2.6.16-rc5/include/linux/sched.h~1_TGID 2006-03-09 00:17:24.000000000 +0300
+++ 2.6.16-rc5/include/linux/sched.h 2006-03-09 03:03:44.000000000 +0300
@@ -748,6 +748,7 @@ struct task_struct {

/* PID/PID hash table linkage. */
struct pid pids[PIDTYPE_MAX];
+ struct list_head thread_group;

struct completion *vfork_done; /* for vfork() */
int __user *set_child_tid; /* CLONE_CHILD_SETTID */
@@ -1181,13 +1182,17 @@ extern void wait_task_inactive(task_t *
#define while_each_thread(g, t) \
while ((t = next_thread(t)) != g)

-extern task_t * FASTCALL(next_thread(const task_t *p));
-
#define thread_group_leader(p) (p->pid == p->tgid)

+static inline task_t *next_thread(task_t *p)
+{
+ return list_entry(rcu_dereference(p->thread_group.next),
+ task_t, thread_group);
+}
+
static inline int thread_group_empty(task_t *p)
{
- return list_empty(&p->pids[PIDTYPE_TGID].pid_list);
+ return list_empty(&p->thread_group);
}

#define delay_group_leader(p) \
--- 2.6.16-rc5/kernel/exit.c~1_TGID 2006-03-09 00:17:24.000000000 +0300
+++ 2.6.16-rc5/kernel/exit.c 2006-03-09 00:42:26.000000000 +0300
@@ -49,7 +49,6 @@ static void __unhash_process(struct task
{
nr_threads--;
detach_pid(p, PIDTYPE_PID);
- detach_pid(p, PIDTYPE_TGID);
if (thread_group_leader(p)) {
detach_pid(p, PIDTYPE_PGID);
detach_pid(p, PIDTYPE_SID);
@@ -57,7 +56,7 @@ static void __unhash_process(struct task
list_del_init(&p->tasks);
__get_cpu_var(process_counts)--;
}
-
+ list_del_rcu(&p->thread_group);
remove_parent(p);
}

@@ -953,13 +952,6 @@ asmlinkage long sys_exit(int error_code)
do_exit((error_code&0xff)<<8);
}

-task_t fastcall *next_thread(const task_t *p)
-{
- return pid_task(p->pids[PIDTYPE_TGID].pid_list.next, PIDTYPE_TGID);
-}
-
-EXPORT_SYMBOL(next_thread);
-
/*
* Take down every thread in the group. This is called by fatal signals
* as well as by sys_exit_group (below).
--- 2.6.16-rc5/kernel/fork.c~1_TGID 2006-03-09 00:17:24.000000000 +0300
+++ 2.6.16-rc5/kernel/fork.c 2006-03-09 03:05:10.000000000 +0300
@@ -1100,6 +1100,7 @@ static task_t *copy_process(unsigned lon
* We dont wake it up yet.
*/
p->group_leader = p;
+ INIT_LIST_HEAD(&p->thread_group);
INIT_LIST_HEAD(&p->ptrace_children);
INIT_LIST_HEAD(&p->ptrace_list);

@@ -1153,7 +1154,9 @@ static task_t *copy_process(unsigned lon
retval = -EAGAIN;
goto bad_fork_cleanup_namespace;
}
+
p->group_leader = current->group_leader;
+ list_add_tail_rcu(&p->thread_group, &p->group_leader->thread_group);

if (current->signal->group_stop_count > 0) {
/*
@@ -1201,7 +1204,6 @@ static task_t *copy_process(unsigned lon
list_add_tail(&p->tasks, &init_task.tasks);
__get_cpu_var(process_counts)++;
}
- attach_pid(p, PIDTYPE_TGID, p->tgid);
attach_pid(p, PIDTYPE_PID, p->pid);
nr_threads++;
}

2006-03-08 22:14:12

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH rc5-mm] pids: kill PIDTYPE_TGID

Oleg Nesterov <[email protected]> writes:

> Eric W. Biederman wrote:
>>
>> > p->group_leader = current->group_leader;
>> > + list_add_tail_rcu(&p->thread_group, &current->thread_group);
>> Can this be:
>> list_add_tail_rcu(&p->thread_group, &current->group_leader->thread_group);
>
> Done.
>
>> That way at least the odds of missing a new task_struct when doing an
>> rcu traversal are reduced almost to 0.
>
> Am I understand correctly? This change has effect when we are doing the
> traversal starting from ->group_leader in a "best effort" manner lockless,
> yes?

Yes. The important point for reasoning about it is that we have a fixed
point that we append or prepend to.

I think I have finally figured out the invariants we will need to
send signals using just the rcu_read_lock(). In that case we want to
traverse the list in such a way that we are guaranteed to never see
new entries. That gives us an atomic snapshot of the tasks to send
signals to, and it gives us a progress guarantee.

Now I need to look and see if any one has documented any rules
regarding atomicity of sending signals to a group of processes.


> [PATCH rc5-mm] pids: kill PIDTYPE_TGID
>
> depends on pidhash-dont-count-idle-threads.patch
>
> This patch kills PIDTYPE_TGID pid_type thus saving one hash table
> in kernel/pid.c and speeding up subthreads create/destroy a bit.
> It is also a preparation for the further tref/pids rework.
>
> This patch adds 'struct list_head thread_group' to 'struct task_struct'
> instead.
>
> We don't detach group leader from PIDTYPE_PID namespace until another
> thread inherits it's ->pid == ->tgid, so we are safe wrt premature
> free_pidmap(->tgid) call.
>
> Currently there are no users of find_task_by_pid_type(PIDTYPE_TGID).
> Should the need arise, we can use find_task_by_pid()->group_leader.
>
> include/linux/pid.h | 1 -
> include/linux/sched.h | 11 ++++++++---
> kernel/exit.c | 10 +---------
> kernel/fork.c | 4 +++-
> 4 files changed, 12 insertions(+), 14 deletions(-)
>
> Signed-off-by: Oleg Nesterov <[email protected]>

Acked-By: Eric Biederman <[email protected]>