2010-11-02 13:58:38

by Sergey Senozhatsky

[permalink] [raw]
Subject: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.

Tetsuo Handa wrote:

Quoting from one of posts in that thead
http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388

| Usually tasklist gives enough protection, but if copy_process() fails
| it calls free_pid() lockless and does call_rcu(delayed_put_pid().
| This means, without rcu lock find_pid_ns() can't scan the hash table
| safely.


Signed-off-by: Sergey Senozhatsky <[email protected]>

---

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 6842eeb..855bc53 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -38,11 +38,13 @@ static int check_clock(const clockid_t which_clock)
return 0;

read_lock(&tasklist_lock);
+ rcu_read_lock();
p = find_task_by_vpid(pid);
if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
same_thread_group(p, current) : thread_group_leader(p))) {
error = -EINVAL;
}
+ rcu_read_unlock();
read_unlock(&tasklist_lock);

return error;
@@ -395,17 +397,21 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
if (pid == 0) {
p = current;
} else {
+ rcu_read_lock();
p = find_task_by_vpid(pid);
if (p && !same_thread_group(p, current))
p = NULL;
+ rcu_read_unlock();
}
} else {
if (pid == 0) {
p = current->group_leader;
} else {
+ rcu_read_lock();
p = find_task_by_vpid(pid);
if (p && !thread_group_leader(p))
p = NULL;
+ rcu_read_unlock();
}
}
new_timer->it.cpu.task = p;


2010-11-02 15:32:11

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:

> Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
> find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
> Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.
>
> Tetsuo Handa wrote:
>
> Quoting from one of posts in that thead
> http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
>
> | Usually tasklist gives enough protection, but if copy_process() fails
> | it calls free_pid() lockless and does call_rcu(delayed_put_pid().
> | This means, without rcu lock find_pid_ns() can't scan the hash table
> | safely.

We can remove the tasklist_lock while at it. rcu_read_lock is enough.

Thanks,

tglx

>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
>
> ---
>
> diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
> index 6842eeb..855bc53 100644
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -38,11 +38,13 @@ static int check_clock(const clockid_t which_clock)
> return 0;
>
> read_lock(&tasklist_lock);
> + rcu_read_lock();
> p = find_task_by_vpid(pid);
> if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
> same_thread_group(p, current) : thread_group_leader(p))) {
> error = -EINVAL;
> }
> + rcu_read_unlock();
> read_unlock(&tasklist_lock);
>
> return error;
> @@ -395,17 +397,21 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
> if (pid == 0) {
> p = current;
> } else {
> + rcu_read_lock();
> p = find_task_by_vpid(pid);
> if (p && !same_thread_group(p, current))
> p = NULL;
> + rcu_read_unlock();
> }
> } else {
> if (pid == 0) {
> p = current->group_leader;
> } else {
> + rcu_read_lock();
> p = find_task_by_vpid(pid);
> if (p && !thread_group_leader(p))
> p = NULL;
> + rcu_read_unlock();
> }
> }
> new_timer->it.cpu.task = p;
>

2010-11-02 16:02:34

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

On (11/02/10 16:31), Thomas Gleixner wrote:
> On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
>
> > Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
> > find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
> > Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.
> >
> > Tetsuo Handa wrote:
> >
> > Quoting from one of posts in that thead
> > http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
> >
> > | Usually tasklist gives enough protection, but if copy_process() fails
> > | it calls free_pid() lockless and does call_rcu(delayed_put_pid().
> > | This means, without rcu lock find_pid_ns() can't scan the hash table
> > | safely.
>
> We can remove the tasklist_lock while at it. rcu_read_lock is enough.
>

Please kindly review.

---

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 6842eeb..96fe2a3 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
if (pid == 0)
return 0;

- read_lock(&tasklist_lock);
+ rcu_read_lock();
p = find_task_by_vpid(pid);
if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
same_thread_group(p, current) : thread_group_leader(p))) {
error = -EINVAL;
}
- read_unlock(&tasklist_lock);
+ rcu_read_unlock();

return error;
}
@@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)

INIT_LIST_HEAD(&new_timer->it.cpu.entry);

- read_lock(&tasklist_lock);
+ rcu_read_lock();
if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
if (pid == 0) {
p = current;
@@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
} else {
ret = -EINVAL;
}
- read_unlock(&tasklist_lock);
+ rcu_read_unlock();

return ret;
}


Attachments:
(No filename) (1.75 kB)
(No filename) (316.00 B)
Download all attachments

2010-11-02 16:05:21

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

Just added Oleg to the cc-list

On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:

> On (11/02/10 16:31), Thomas Gleixner wrote:
> > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> >
> > > Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
> > > find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
> > > Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.
> > >
> > > Tetsuo Handa wrote:
> > >
> > > Quoting from one of posts in that thead
> > > http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
> > >
> > > | Usually tasklist gives enough protection, but if copy_process() fails
> > > | it calls free_pid() lockless and does call_rcu(delayed_put_pid().
> > > | This means, without rcu lock find_pid_ns() can't scan the hash table
> > > | safely.
> >
> > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> >
>
> Please kindly review.
>
> ---
>
> diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
> index 6842eeb..96fe2a3 100644
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
> if (pid == 0)
> return 0;
>
> - read_lock(&tasklist_lock);
> + rcu_read_lock();
> p = find_task_by_vpid(pid);
> if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
> same_thread_group(p, current) : thread_group_leader(p))) {
> error = -EINVAL;
> }
> - read_unlock(&tasklist_lock);
> + rcu_read_unlock();
>
> return error;
> }
> @@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>
> INIT_LIST_HEAD(&new_timer->it.cpu.entry);
>
> - read_lock(&tasklist_lock);
> + rcu_read_lock();
> if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
> if (pid == 0) {
> p = current;
> @@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
> } else {
> ret = -EINVAL;
> }
> - read_unlock(&tasklist_lock);
> + rcu_read_unlock();
>
> return ret;
> }
>
>

2010-11-02 18:40:08

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

> On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
>
> > On (11/02/10 16:31), Thomas Gleixner wrote:
> > > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> > >
> > > > Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
> > > > find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
> > > > Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.
> > > >
> > > > Tetsuo Handa wrote:
> > > >
> > > > Quoting from one of posts in that thead
> > > > http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
> > > >
> > > > | Usually tasklist gives enough protection, but if copy_process() fails
> > > > | it calls free_pid() lockless and does call_rcu(delayed_put_pid().
> > > > | This means, without rcu lock find_pid_ns() can't scan the hash table
> > > > | safely.
> > >
> > > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > >

Yes, I believe posix-cpu-timers.c shouldn't use tasklist at all,
but it is not trivial to change this code.

Minor nit,

> > @@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
> >
> > INIT_LIST_HEAD(&new_timer->it.cpu.entry);
> >
> > - read_lock(&tasklist_lock);
> > + rcu_read_lock();
> > if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
> > if (pid == 0) {
> > p = current;
> > @@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
> > } else {
> > ret = -EINVAL;
> > }
> > - read_unlock(&tasklist_lock);
> > + rcu_read_unlock();

I think this change is fine, but please note that thread_group_leader()
check is not relaible without tasklist. If we race with de_thread()
find_task_by_vpid() can find the new leader before it updates its
->group_leader. IOW, posix_cpu_timer_create() can fail when it shouldn't.

Not that I think this really matters, posix_cpu_timer_create() has
other problems with de_thread(). But perhaps it makes sense to
change posix_cpu_timer_create() to use has_group_leader_pid() instead,
just to make this code not look racy and avoid adding new problems.

The real fix, I think, should change cpu_timer_list to use
struct pid* instead of task_struct.

Oleg.

2010-11-03 10:58:28

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

On (11/02/10 19:33), Oleg Nesterov wrote:
> > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> >
> > > > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > > >
>
> Yes, I believe posix-cpu-timers.c shouldn't use tasklist at all,
> but it is not trivial to change this code.
>
>[..]
>
> I think this change is fine, but please note that thread_group_leader()
> check is not relaible without tasklist. If we race with de_thread()
> find_task_by_vpid() can find the new leader before it updates its
> ->group_leader. IOW, posix_cpu_timer_create() can fail when it shouldn't.
>
> Not that I think this really matters, posix_cpu_timer_create() has
> other problems with de_thread(). But perhaps it makes sense to
> change posix_cpu_timer_create() to use has_group_leader_pid() instead,
> just to make this code not look racy and avoid adding new problems.
>
> The real fix, I think, should change cpu_timer_list to use
> struct pid* instead of task_struct.
>

Hello,
Using has_group_leader_pid instead of thread_group_leader, when tasklist_lock
is not aquired (check_clock and posix_cpu_timer_create).

---

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 6842eeb..05bb717 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
if (pid == 0)
return 0;

- read_lock(&tasklist_lock);
+ rcu_read_lock();
p = find_task_by_vpid(pid);
if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
- same_thread_group(p, current) : thread_group_leader(p))) {
+ same_thread_group(p, current) : has_group_leader_pid(p))) {
error = -EINVAL;
}
- read_unlock(&tasklist_lock);
+ rcu_read_unlock();

return error;
}
@@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)

INIT_LIST_HEAD(&new_timer->it.cpu.entry);

- read_lock(&tasklist_lock);
+ rcu_read_lock();
if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
if (pid == 0) {
p = current;
@@ -404,7 +404,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
p = current->group_leader;
} else {
p = find_task_by_vpid(pid);
- if (p && !thread_group_leader(p))
+ if (p && !has_group_leader_pid(p))
p = NULL;
}
}
@@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
} else {
ret = -EINVAL;
}
- read_unlock(&tasklist_lock);
+ rcu_read_unlock();

return ret;
}



Attachments:
(No filename) (2.38 kB)
(No filename) (316.00 B)
Download all attachments

2010-11-03 12:55:10

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

Damn.

Sergey, Thomas, please wait a bit.

Yes, I believe this patch is fine by itself. But looking into
posix-cpu-timers.c again, I suspect that those "other problems
with de_thread" I already mentioned are much more serious and
need the urgent fix.

I'll try to verify this a bit later today.

In any case, I believe someone should find the time to audit/
rewrite posix-cpu-timers.c ;)

On 11/03, Sergey Senozhatsky wrote:
>
> On (11/02/10 19:33), Oleg Nesterov wrote:
> > > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> > >
> > > > > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > > > >
> >
> > Yes, I believe posix-cpu-timers.c shouldn't use tasklist at all,
> > but it is not trivial to change this code.
> >
> >[..]
> >
> > I think this change is fine, but please note that thread_group_leader()
> > check is not relaible without tasklist. If we race with de_thread()
> > find_task_by_vpid() can find the new leader before it updates its
> > ->group_leader. IOW, posix_cpu_timer_create() can fail when it shouldn't.
> >
> > Not that I think this really matters, posix_cpu_timer_create() has
> > other problems with de_thread(). But perhaps it makes sense to
> > change posix_cpu_timer_create() to use has_group_leader_pid() instead,
> > just to make this code not look racy and avoid adding new problems.
> >
> > The real fix, I think, should change cpu_timer_list to use
> > struct pid* instead of task_struct.
> >
>
> Hello,
> Using has_group_leader_pid instead of thread_group_leader, when tasklist_lock
> is not aquired (check_clock and posix_cpu_timer_create).
>
> ---
>
> diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
> index 6842eeb..05bb717 100644
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
> if (pid == 0)
> return 0;
>
> - read_lock(&tasklist_lock);
> + rcu_read_lock();
> p = find_task_by_vpid(pid);
> if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
> - same_thread_group(p, current) : thread_group_leader(p))) {
> + same_thread_group(p, current) : has_group_leader_pid(p))) {
> error = -EINVAL;
> }
> - read_unlock(&tasklist_lock);
> + rcu_read_unlock();
>
> return error;
> }
> @@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>
> INIT_LIST_HEAD(&new_timer->it.cpu.entry);
>
> - read_lock(&tasklist_lock);
> + rcu_read_lock();
> if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
> if (pid == 0) {
> p = current;
> @@ -404,7 +404,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
> p = current->group_leader;
> } else {
> p = find_task_by_vpid(pid);
> - if (p && !thread_group_leader(p))
> + if (p && !has_group_leader_pid(p))
> p = NULL;
> }
> }
> @@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
> } else {
> ret = -EINVAL;
> }
> - read_unlock(&tasklist_lock);
> + rcu_read_unlock();
>
> return ret;
> }
>
>


2010-11-03 16:17:30

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

On 11/03, Oleg Nesterov wrote:
>
> Sergey, Thomas, please wait a bit.
>
> Yes, I believe this patch is fine by itself. But looking into
> posix-cpu-timers.c again, I suspect that those "other problems
> with de_thread" I already mentioned are much more serious and
> need the urgent fix.

Yes. I'll send the patch tomorrow.


However, my initial thinking was wrong, that bug is orthogonal
to this patch.

Sergey, how much will you hate me if I ask you to re-send it again?


> > On (11/02/10 19:33), Oleg Nesterov wrote:
> > > > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> > > >
> > > > > > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > > > > >
> > >
> > > Yes, I believe posix-cpu-timers.c shouldn't use tasklist at all,
> > > but it is not trivial to change this code.
> > >
> > >[..]
> > >
> > > I think this change is fine, but please note that thread_group_leader()
> > > check is not relaible without tasklist. If we race with de_thread()
> > > find_task_by_vpid() can find the new leader before it updates its
> > > ->group_leader. IOW, posix_cpu_timer_create() can fail when it shouldn't.
> > >
> > > Not that I think this really matters, posix_cpu_timer_create() has
> > > other problems with de_thread(). But perhaps it makes sense to
> > > change posix_cpu_timer_create() to use has_group_leader_pid() instead,
> > > just to make this code not look racy and avoid adding new problems.
> > >
> > > The real fix, I think, should change cpu_timer_list to use
> > > struct pid* instead of task_struct.
> > >
> >
> > Hello,
> > Using has_group_leader_pid instead of thread_group_leader, when tasklist_lock
> > is not aquired (check_clock and posix_cpu_timer_create).

This doesn't look like a valid changelog ;) I'd suggest you to write
the new one without quoting old emails. It should explain that a)
tasklist_lock is not enough for find_vpid() and b) it is not needed.

Also, you forgot to add your signed-of-by.

Otherwise,

Reviewed-by: Oleg Nesterov <[email protected]>

2010-11-03 16:38:13

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

On (11/03/10 17:10), Oleg Nesterov wrote:
> On 11/03, Oleg Nesterov wrote:
> >
> > Sergey, Thomas, please wait a bit.
> >
> > Yes, I believe this patch is fine by itself. But looking into
> > posix-cpu-timers.c again, I suspect that those "other problems
> > with de_thread" I already mentioned are much more serious and
> > need the urgent fix.
>
> Yes. I'll send the patch tomorrow.
>
>
> However, my initial thinking was wrong, that bug is orthogonal
> to this patch.
>
> Sergey, how much will you hate me if I ask you to re-send it again?
>

Oleg, not a problem at all. Which one should I resend?

# desc.
1) added rcu_read_lock/unlock
2) removed tasklist_lock
3) added has_group_leader_pid


Sergey


>
> > > On (11/02/10 19:33), Oleg Nesterov wrote:
> > > > > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> > > > >
> > > > > > > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > > > > > >
> > > >
> > > > Yes, I believe posix-cpu-timers.c shouldn't use tasklist at all,
> > > > but it is not trivial to change this code.
> > > >
> > > >[..]
> > > >
> > > > I think this change is fine, but please note that thread_group_leader()
> > > > check is not relaible without tasklist. If we race with de_thread()
> > > > find_task_by_vpid() can find the new leader before it updates its
> > > > ->group_leader. IOW, posix_cpu_timer_create() can fail when it shouldn't.
> > > >
> > > > Not that I think this really matters, posix_cpu_timer_create() has
> > > > other problems with de_thread(). But perhaps it makes sense to
> > > > change posix_cpu_timer_create() to use has_group_leader_pid() instead,
> > > > just to make this code not look racy and avoid adding new problems.
> > > >
> > > > The real fix, I think, should change cpu_timer_list to use
> > > > struct pid* instead of task_struct.
> > > >
> > >
> > > Hello,
> > > Using has_group_leader_pid instead of thread_group_leader, when tasklist_lock
> > > is not aquired (check_clock and posix_cpu_timer_create).
>
> This doesn't look like a valid changelog ;) I'd suggest you to write
> the new one without quoting old emails. It should explain that a)
> tasklist_lock is not enough for find_vpid() and b) it is not needed.
>
> Also, you forgot to add your signed-of-by.
>
> Otherwise,
>
> Reviewed-by: Oleg Nesterov <[email protected]>
>


Attachments:
(No filename) (2.28 kB)
(No filename) (316.00 B)
Download all attachments

2010-11-03 16:52:53

by Sergey Senozhatsky

[permalink] [raw]
Subject: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.

Tetsuo Handa wrote:
| Quoting from one of posts in that thead
| http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
|
|| Usually tasklist gives enough protection, but if copy_process() fails
|| it calls free_pid() lockless and does call_rcu(delayed_put_pid().
|| This means, without rcu lock find_pid_ns() can't scan the hash table
|| safely.

Thomas Gleixner wrote:
| We can remove the tasklist_lock while at it. rcu_read_lock is enough.

Patch also replaces thread_group_leader with has_group_leader_pid
in accordance to comment by Oleg Nesterov:

| ... thread_group_leader() check is not relaible without
| tasklist. If we race with de_thread() find_task_by_vpid() can find
| the new leader before it updates its ->group_leader.
|
| perhaps it makes sense to change posix_cpu_timer_create() to use
| has_group_leader_pid() instead, just to make this code not look racy
| and avoid adding new problems.


Signed-off-by: Sergey Senozhatsky <[email protected]>

---

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 6842eeb..05bb717 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
if (pid == 0)
return 0;

- read_lock(&tasklist_lock);
+ rcu_read_lock();
p = find_task_by_vpid(pid);
if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
- same_thread_group(p, current) : thread_group_leader(p))) {
+ same_thread_group(p, current) : has_group_leader_pid(p))) {
error = -EINVAL;
}
- read_unlock(&tasklist_lock);
+ rcu_read_unlock();

return error;
}
@@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)

INIT_LIST_HEAD(&new_timer->it.cpu.entry);

- read_lock(&tasklist_lock);
+ rcu_read_lock();
if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
if (pid == 0) {
p = current;
@@ -404,7 +404,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
p = current->group_leader;
} else {
p = find_task_by_vpid(pid);
- if (p && !thread_group_leader(p))
+ if (p && !has_group_leader_pid(p))
p = NULL;
}
}
@@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
} else {
ret = -EINVAL;
}
- read_unlock(&tasklist_lock);
+ rcu_read_unlock();

return ret;
}

2010-11-03 17:24:18

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call

On 11/03, Sergey Senozhatsky wrote:
>
> Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
> find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
> Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.
>
> Tetsuo Handa wrote:
> | Quoting from one of posts in that thead
> | http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
> |
> || Usually tasklist gives enough protection, but if copy_process() fails
> || it calls free_pid() lockless and does call_rcu(delayed_put_pid().
> || This means, without rcu lock find_pid_ns() can't scan the hash table
> || safely.
>
> Thomas Gleixner wrote:
> | We can remove the tasklist_lock while at it. rcu_read_lock is enough.
>
> Patch also replaces thread_group_leader with has_group_leader_pid
> in accordance to comment by Oleg Nesterov:
>
> | ... thread_group_leader() check is not relaible without
> | tasklist. If we race with de_thread() find_task_by_vpid() can find
> | the new leader before it updates its ->group_leader.
> |
> | perhaps it makes sense to change posix_cpu_timer_create() to use
> | has_group_leader_pid() instead, just to make this code not look racy
> | and avoid adding new problems.
>
>
> Signed-off-by: Sergey Senozhatsky <[email protected]>

Reviewed-by: Oleg Nesterov <[email protected]>


> diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
> index 6842eeb..05bb717 100644
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
> if (pid == 0)
> return 0;
>
> - read_lock(&tasklist_lock);
> + rcu_read_lock();
> p = find_task_by_vpid(pid);
> if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
> - same_thread_group(p, current) : thread_group_leader(p))) {
> + same_thread_group(p, current) : has_group_leader_pid(p))) {
> error = -EINVAL;
> }
> - read_unlock(&tasklist_lock);
> + rcu_read_unlock();
>
> return error;
> }
> @@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>
> INIT_LIST_HEAD(&new_timer->it.cpu.entry);
>
> - read_lock(&tasklist_lock);
> + rcu_read_lock();
> if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
> if (pid == 0) {
> p = current;
> @@ -404,7 +404,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
> p = current->group_leader;
> } else {
> p = find_task_by_vpid(pid);
> - if (p && !thread_group_leader(p))
> + if (p && !has_group_leader_pid(p))
> p = NULL;
> }
> }
> @@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
> } else {
> ret = -EINVAL;
> }
> - read_unlock(&tasklist_lock);
> + rcu_read_unlock();
>
> return ret;
> }
>

2010-11-05 16:00:44

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec

posix-cpu-timers.c correctly assumes that the dying process does
posix_cpu_timers_exit_group() and removes all !CPUCLOCK_PERTHREAD
timers from signal->cpu_timers list.

But, it also assumes that timer->it.cpu.task is always the group
leader, and thus the dead ->task means the dead thread group.

This is obviously not true after de_thread() changes the leader.
After that almost every posix_cpu_timer_ method has problems.

It is not simple to fix this bug correctly. First of all, I think
that timer->it.cpu should use struct pid instead of task_struct.
Also, the locking should be reworked completely. In particular,
tasklist_lock should not be used at all. This all needs a lot of
nontrivial and hard-to-test changes.

Change __exit_signal() to do posix_cpu_timers_exit_group() when
the old leader dies during exec. This is not the fix, just the
temporary hack to hide the problem for 2.6.37 and stable. IOW,
this is obviously wrong but this is what we currently have anyway:
cpu timers do not work after mt exec.

In theory this change adds another race. The exiting leader can
detach the timers which were attached to the new leader. However,
the window between de_thread() and release_task() is small, we
can pretend that sys_timer_create() was called before de_thread().

Signed-off-by: Oleg Nesterov <[email protected]>
---

kernel/exit.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- kstub/kernel/exit.c~pct_de_thread_race 2010-08-17 12:32:24.000000000 +0200
+++ kstub/kernel/exit.c 2010-11-04 21:30:18.000000000 +0100
@@ -95,6 +95,14 @@ static void __exit_signal(struct task_st
sig->tty = NULL;
} else {
/*
+ * This can only happen if the caller is de_thread().
+ * FIXME: this is the temporary hack, we should teach
+ * posix-cpu-timers to handle this case correctly.
+ */
+ if (unlikely(has_group_leader_pid(tsk)))
+ posix_cpu_timers_exit_group(tsk);
+
+ /*
* If there is any task waiting for the group exit
* then notify it:
*/

2010-11-08 18:15:46

by Roland McGrath

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec

That seems like a reasonable workaround off hand. But since all the
reworkings I am no longer particularly authoritative on this code.


Thanks,
Roland

2010-11-09 14:51:39

by Stanislaw Gruszka

[permalink] [raw]
Subject: Re: [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec

On Fri, Nov 05, 2010 at 04:53:42PM +0100, Oleg Nesterov wrote:
> posix-cpu-timers.c correctly assumes that the dying process does
> posix_cpu_timers_exit_group() and removes all !CPUCLOCK_PERTHREAD
> timers from signal->cpu_timers list.
>
> But, it also assumes that timer->it.cpu.task is always the group
> leader, and thus the dead ->task means the dead thread group.
>
> This is obviously not true after de_thread() changes the leader.
> After that almost every posix_cpu_timer_ method has problems.
>
> It is not simple to fix this bug correctly. First of all, I think
> that timer->it.cpu should use struct pid instead of task_struct.
> Also, the locking should be reworked completely. In particular,
> tasklist_lock should not be used at all. This all needs a lot of
> nontrivial and hard-to-test changes.
>
> Change __exit_signal() to do posix_cpu_timers_exit_group() when
> the old leader dies during exec. This is not the fix, just the
> temporary hack to hide the problem for 2.6.37 and stable. IOW,
> this is obviously wrong but this is what we currently have anyway:
> cpu timers do not work after mt exec.
>
> In theory this change adds another race. The exiting leader can
> detach the timers which were attached to the new leader. However,
> the window between de_thread() and release_task() is small, we
> can pretend that sys_timer_create() was called before de_thread().
>
> Signed-off-by: Oleg Nesterov <[email protected]>

Reviewed-by: Stanislaw Gruszka <[email protected]>