2008-03-04 18:58:31

by Oleg Nesterov

[permalink] [raw]
Subject: [RFC,PATCH 2/2] kill_pid_info: don't take now unneeded tasklist_lock

Previously handle_stop_signal(SIGCONT) could drop ->siglock. That is why
kill_pid_info(SIGCONT) takes tasklist_lock to make sure the target task
can't go away after unlock. Not needed now.

Signed-off-by: Oleg Nesterov <[email protected]>

--- 25/include/linux/signal.h~1__KILL_NO_TASKLIST 2008-02-15 16:59:17.000000000 +0300
+++ 25/include/linux/signal.h 2008-03-04 21:43:25.000000000 +0300
@@ -362,8 +362,6 @@ int unhandled_signal(struct task_struct
#define sig_kernel_stop(sig) \
(((sig) < SIGRTMIN) && siginmask(sig, SIG_KERNEL_STOP_MASK))

-#define sig_needs_tasklist(sig) ((sig) == SIGCONT)
-
#define sig_user_defined(t, signr) \
(((t)->sighand->action[(signr)-1].sa.sa_handler != SIG_DFL) && \
((t)->sighand->action[(signr)-1].sa.sa_handler != SIG_IGN))
--- 25/kernel/signal.c~1__KILL_NO_TASKLIST 2008-03-04 21:24:20.000000000 +0300
+++ 25/kernel/signal.c 2008-03-04 21:44:58.000000000 +0300
@@ -1030,9 +1030,6 @@ int kill_pid_info(int sig, struct siginf
struct task_struct *p;

rcu_read_lock();
- if (unlikely(sig_needs_tasklist(sig)))
- read_lock(&tasklist_lock);
-
retry:
p = pid_task(pid, PIDTYPE_PID);
if (p) {
@@ -1046,10 +1043,8 @@ retry:
*/
goto retry;
}
-
- if (unlikely(sig_needs_tasklist(sig)))
- read_unlock(&tasklist_lock);
rcu_read_unlock();
+
return error;
}


2008-03-06 10:59:53

by Roland McGrath

[permalink] [raw]
Subject: Re: [RFC,PATCH 2/2] kill_pid_info: don't take now unneeded tasklist_lock

A nice payoff for the earlier cleanups.
Maybe it even improves parallelism for a real workload someday.

Thanks,
Roland

Subject: Re: [RFC,PATCH 2/2] kill_pid_info: don't take now unneeded tasklist_lock

(2008/03/05 3:57), Oleg Nesterov wrote:
> Previously handle_stop_signal(SIGCONT) could drop ->siglock. That is why
> kill_pid_info(SIGCONT) takes tasklist_lock to make sure the target task
> can't go away after unlock. Not needed now.
>
> Signed-off-by: Oleg Nesterov <[email protected]>

Hi Oleg,

I tried your patches on vanila kernel 2.6.25-rc3 (ia64). Then, I got
the NULL pointer dereference at task_session_nr(t) in
check_kill_permission(). That is why t->signal->__session is accessed
after t->signal was released. It is reproducible by sending many
SIGCONT signals to exiting processes.

The trace is as follows:

Pid: 7807, CPU 9, comm: kill_sigcont
psr : 00001210085a6010 ifs : 800000000000030a ip : [<a0000001000ab441>] Not
tainted (2.6.25-rc3-debug)
ip is at check_kill_permission+0x181/0x2a0
unat: 0000000000000000 pfs : 000000000000038a rsc : 0000000000000003
rnat: 0000000000000206 bsps: 0000000000000003 pr : 000000000056a999
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a0000001000ad510 b6 : a00000010000eb80 b7 : a00000010000eb50
f6 : 000000000000000000000 f7 : 000000000000000000000
f8 : 000000000000000000000 f9 : 000000000000000000000
f10 : 000000000000000000000 f11 : 000000000000000000000
r1 : a000000100cc6a70 r2 : 0000000000000000 r3 : e000001f43f07dc8
r8 : a000000100add08c r9 : a000000100add08c r10 : 0000000000003dc4
r11 : e000001f49d901e4 r12 : e000001f43f07da0 r13 : e000001f43f00000
r14 : 0000000000000012 r15 : 0000000000000000 r16 : a000000100add0a8
r17 : a000000100add0a8 r18 : fb00000000000000 r19 : d800000000000000
r20 : e000001f4493de48 r21 : 0000000000001df6 r22 : 0000000000000100
r23 : e000001f424c5280 r24 : 0000000000000000 r25 : e000001f424c5180
r26 : e000001f49d90b08 r27 : e000001f43f00b08 r28 : 0000000000003dc4
r29 : a000000100adcec0 r30 : a000000100a73a28 r31 : e000001f4493de40

Call Trace:
[<a000000100014440>] show_stack+0x40/0xa0
sp=e000001f43f077f0 bsp=e000001f43f01060
[<a000000100014d50>] show_regs+0x850/0x8a0
sp=e000001f43f079c0 bsp=e000001f43f01008
[<a0000001000360b0>] die+0x1b0/0x2c0
sp=e000001f43f079c0 bsp=e000001f43f00fb8
[<a000000100036210>] die_if_kernel+0x50/0x80
sp=e000001f43f079c0 bsp=e000001f43f00f88
[<a0000001005ce7e0>] ia64_fault+0x11a0/0x12c0
sp=e000001f43f079c0 bsp=e000001f43f00f30
[<a00000010000ab00>] ia64_leave_kernel+0x0/0x270
sp=e000001f43f07bd0 bsp=e000001f43f00f30
[<a0000001000ab440>] check_kill_permission+0x180/0x2a0
sp=e000001f43f07da0 bsp=e000001f43f00ee0
[<a0000001000ad510>] group_send_sig_info+0x30/0x100
sp=e000001f43f07da0 bsp=e000001f43f00ea8
[<a0000001000ad620>] kill_pid_info+0x40/0x80
sp=e000001f43f07db0 bsp=e000001f43f00e70
[<a0000001000adb30>] sys_kill+0xd0/0x2e0
sp=e000001f43f07db0 bsp=e000001f43f00df0
[<a00000010000a960>] ia64_ret_from_syscall+0x0/0x20
sp=e000001f43f07e30 bsp=e000001f43f00df0
[<a000000000010720>] __kernel_syscall_via_break+0x0/0x20
sp=e000001f43f08000 bsp=e000001f43f00df0

Thanks,
-Atsushi Tsuji.

2008-03-17 16:57:29

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [RFC,PATCH 2/2] kill_pid_info: don't take now unneeded tasklist_lock

On 03/17, Atsushi Tsuji wrote:
>
> (2008/03/05 3:57), Oleg Nesterov wrote:
> >Previously handle_stop_signal(SIGCONT) could drop ->siglock. That is why
> >kill_pid_info(SIGCONT) takes tasklist_lock to make sure the target task
> >can't go away after unlock. Not needed now.
> >
> >Signed-off-by: Oleg Nesterov <[email protected]>
>
> Hi Oleg,
>
> I tried your patches on vanila kernel 2.6.25-rc3 (ia64). Then, I got
> the NULL pointer dereference at task_session_nr(t) in
> check_kill_permission(). That is why t->signal->__session is accessed
> after t->signal was released. It is reproducible by sending many
> SIGCONT signals to exiting processes.

Ah. Indeed!!! Thanks a lot Atsushi.

Note that check_kill_permission() is the last user of the deprecated
signal->__session/session, I was going to change this code later, but
missed the issue you pointed out.

I'll make the patch tomorrow.

Thanks!

Oleg.

Subject: Re: [PATCH] signals: check_kill_permission: check session under tasklist_lock

Oleg Nesterov wrote:
> (on top of signals-cleanup-security_task_kill-usage-implementation.patch)
>
> This wasn't documented, but as Atsushi Tsuji <[email protected]> pointed
> out check_kill_permission() needs tasklist_lock for task_session_nr().
> I missed this fact when removed tasklist from the callers.
>
> Change check_kill_permission() to take tasklist_lock for the SIGCONT case.
> Re-order security checks so that we take tasklist_lock only if/when it is
> actually needed. This is a minimal fix for now, tasklist will be removed
> later.

Thanks, I confirmed the problem is fixed by this patch.

>
> Also change the code to use task_session() instead of task_session_nr().
>
> Also, remove the SIGCONT check from cap_task_kill(), it is bogus (and the
> whole function is bogus. Serge, Eric, why it is still alive?).
>
> Signed-off-by: Oleg Nesterov <[email protected]>

Acked-by: Atsushi Tsuji <[email protected]>

>
> --- 25/kernel/signal.c~CKP_TAKE_TASKLIST 2008-03-18 14:47:00.000000000 +0300
> +++ 25/kernel/signal.c 2008-03-18 17:25:19.000000000 +0300
> @@ -533,6 +533,7 @@ static int rm_from_queue(unsigned long m
> static int check_kill_permission(int sig, struct siginfo *info,
> struct task_struct *t)
> {
> + struct pid *sid;
> int error;
>
> if (!valid_signal(sig))
> @@ -545,11 +546,24 @@ static int check_kill_permission(int sig
> if (error)
> return error;
>
> - if (((sig != SIGCONT) || (task_session_nr(current) != task_session_nr(t)))
> - && (current->euid ^ t->suid) && (current->euid ^ t->uid)
> - && (current->uid ^ t->suid) && (current->uid ^ t->uid)
> - && !capable(CAP_KILL))
> - return -EPERM;
> + if ((current->euid ^ t->suid) && (current->euid ^ t->uid) &&
> + (current->uid ^ t->suid) && (current->uid ^ t->uid) &&
> + !capable(CAP_KILL)) {
> + switch (sig) {
> + case SIGCONT:
> + read_lock(&tasklist_lock);
> + sid = task_session(t);
> + read_unlock(&tasklist_lock);
> + /*
> + * We don't return the error if sid == NULL. The
> + * task was unhashed, the caller must notice this.
> + */
> + if (!sid || sid == task_session(current))
> + break;
> + default:
> + return -EPERM;
> + }
> + }
>
> return security_task_kill(t, info, sig, 0);
> }
> --- 25/security/commoncap.c~CKP_TAKE_TASKLIST 2008-03-18 17:07:02.000000000 +0300
> +++ 25/security/commoncap.c 2008-03-18 17:21:10.000000000 +0300
> @@ -552,10 +552,6 @@ int cap_task_kill(struct task_struct *p,
> if (p->uid == current->uid)
> return 0;
>
> - /* sigcont is permitted within same session */
> - if (sig == SIGCONT && (task_session_nr(current) == task_session_nr(p)))
> - return 0;
> -
> if (secid)
> /*
> * Signal sent as a particular user.

2008-03-19 22:31:53

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH] signals: check_kill_permission: check session under tasklist_lock

Quoting Oleg Nesterov ([email protected]):
> On 03/18, [email protected] wrote:
> >
> > Quoting Oleg Nesterov ([email protected]):
> > > --- 25/security/commoncap.c~CKP_TAKE_TASKLIST 2008-03-18 17:07:02.000000000 +0300
> > > +++ 25/security/commoncap.c 2008-03-18 17:21:10.000000000 +0300
> > > @@ -552,10 +552,6 @@ int cap_task_kill(struct task_struct *p,
> > > if (p->uid == current->uid)
> > > return 0;
> > >
> > > - /* sigcont is permitted within same session */
> > > - if (sig == SIGCONT && (task_session_nr(current) == task_session_nr(p)))
> > > - return 0;
> > > -
> > > if (secid)
> > > /*
> > > * Signal sent as a particular user.
> >
> > Note that cap_task_kill() should be gone anyway. What tree were you
> > basing this on?
>
> Ah. I realy hoped that cap_task_kill() was already killed. And I googled
> this patch: http://marc.info/?l=linux-kernel&m=120422062515386
>
> But I checked 2.6.25-rc5-mm1.bz2, it is still here. And I didn't find
> anything related in http://userweb.kernel.org/~akpm/mmotm/. I even checked
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=history;f=security/commoncap.c
>
> So, is it still here or killed? If it is dead - great ;)

Hmm, I thought it had been pulled in. I don't see it. I'll re-spin and
re-send against -mm, -stable, and -git.

thanks,
-serge

2008-03-19 22:55:39

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH] signals: check_kill_permission: check session under tasklist_lock

(on top of signals-cleanup-security_task_kill-usage-implementation.patch)

This wasn't documented, but as Atsushi Tsuji <[email protected]> pointed
out check_kill_permission() needs tasklist_lock for task_session_nr().
I missed this fact when removed tasklist from the callers.

Change check_kill_permission() to take tasklist_lock for the SIGCONT case.
Re-order security checks so that we take tasklist_lock only if/when it is
actually needed. This is a minimal fix for now, tasklist will be removed
later.

Also change the code to use task_session() instead of task_session_nr().

Also, remove the SIGCONT check from cap_task_kill(), it is bogus (and the
whole function is bogus. Serge, Eric, why it is still alive?).

Signed-off-by: Oleg Nesterov <[email protected]>

--- 25/kernel/signal.c~CKP_TAKE_TASKLIST 2008-03-18 14:47:00.000000000 +0300
+++ 25/kernel/signal.c 2008-03-18 17:25:19.000000000 +0300
@@ -533,6 +533,7 @@ static int rm_from_queue(unsigned long m
static int check_kill_permission(int sig, struct siginfo *info,
struct task_struct *t)
{
+ struct pid *sid;
int error;

if (!valid_signal(sig))
@@ -545,11 +546,24 @@ static int check_kill_permission(int sig
if (error)
return error;

- if (((sig != SIGCONT) || (task_session_nr(current) != task_session_nr(t)))
- && (current->euid ^ t->suid) && (current->euid ^ t->uid)
- && (current->uid ^ t->suid) && (current->uid ^ t->uid)
- && !capable(CAP_KILL))
- return -EPERM;
+ if ((current->euid ^ t->suid) && (current->euid ^ t->uid) &&
+ (current->uid ^ t->suid) && (current->uid ^ t->uid) &&
+ !capable(CAP_KILL)) {
+ switch (sig) {
+ case SIGCONT:
+ read_lock(&tasklist_lock);
+ sid = task_session(t);
+ read_unlock(&tasklist_lock);
+ /*
+ * We don't return the error if sid == NULL. The
+ * task was unhashed, the caller must notice this.
+ */
+ if (!sid || sid == task_session(current))
+ break;
+ default:
+ return -EPERM;
+ }
+ }

return security_task_kill(t, info, sig, 0);
}
--- 25/security/commoncap.c~CKP_TAKE_TASKLIST 2008-03-18 17:07:02.000000000 +0300
+++ 25/security/commoncap.c 2008-03-18 17:21:10.000000000 +0300
@@ -552,10 +552,6 @@ int cap_task_kill(struct task_struct *p,
if (p->uid == current->uid)
return 0;

- /* sigcont is permitted within same session */
- if (sig == SIGCONT && (task_session_nr(current) == task_session_nr(p)))
- return 0;
-
if (secid)
/*
* Signal sent as a particular user.

2008-03-19 22:59:00

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] signals: check_kill_permission: check session under tasklist_lock

On 03/18, [email protected] wrote:
>
> Quoting Oleg Nesterov ([email protected]):
> > --- 25/security/commoncap.c~CKP_TAKE_TASKLIST 2008-03-18 17:07:02.000000000 +0300
> > +++ 25/security/commoncap.c 2008-03-18 17:21:10.000000000 +0300
> > @@ -552,10 +552,6 @@ int cap_task_kill(struct task_struct *p,
> > if (p->uid == current->uid)
> > return 0;
> >
> > - /* sigcont is permitted within same session */
> > - if (sig == SIGCONT && (task_session_nr(current) == task_session_nr(p)))
> > - return 0;
> > -
> > if (secid)
> > /*
> > * Signal sent as a particular user.
>
> Note that cap_task_kill() should be gone anyway. What tree were you
> basing this on?

Ah. I realy hoped that cap_task_kill() was already killed. And I googled
this patch: http://marc.info/?l=linux-kernel&m=120422062515386

But I checked 2.6.25-rc5-mm1.bz2, it is still here. And I didn't find
anything related in http://userweb.kernel.org/~akpm/mmotm/. I even checked
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=history;f=security/commoncap.c

So, is it still here or killed? If it is dead - great ;)

Oleg.

2008-03-19 23:36:08

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH] signals: check_kill_permission: check session under tasklist_lock

Quoting Oleg Nesterov ([email protected]):
> (on top of signals-cleanup-security_task_kill-usage-implementation.patch)
>
> This wasn't documented, but as Atsushi Tsuji <[email protected]> pointed
> out check_kill_permission() needs tasklist_lock for task_session_nr().
> I missed this fact when removed tasklist from the callers.
>
> Change check_kill_permission() to take tasklist_lock for the SIGCONT case.
> Re-order security checks so that we take tasklist_lock only if/when it is
> actually needed. This is a minimal fix for now, tasklist will be removed
> later.
>
> Also change the code to use task_session() instead of task_session_nr().
>
> Also, remove the SIGCONT check from cap_task_kill(), it is bogus (and the
> whole function is bogus. Serge, Eric, why it is still alive?).
>
> Signed-off-by: Oleg Nesterov <[email protected]>
>
> --- 25/kernel/signal.c~CKP_TAKE_TASKLIST 2008-03-18 14:47:00.000000000 +0300
> +++ 25/kernel/signal.c 2008-03-18 17:25:19.000000000 +0300
> @@ -533,6 +533,7 @@ static int rm_from_queue(unsigned long m
> static int check_kill_permission(int sig, struct siginfo *info,
> struct task_struct *t)
> {
> + struct pid *sid;
> int error;
>
> if (!valid_signal(sig))
> @@ -545,11 +546,24 @@ static int check_kill_permission(int sig
> if (error)
> return error;
>
> - if (((sig != SIGCONT) || (task_session_nr(current) != task_session_nr(t)))
> - && (current->euid ^ t->suid) && (current->euid ^ t->uid)
> - && (current->uid ^ t->suid) && (current->uid ^ t->uid)
> - && !capable(CAP_KILL))
> - return -EPERM;
> + if ((current->euid ^ t->suid) && (current->euid ^ t->uid) &&
> + (current->uid ^ t->suid) && (current->uid ^ t->uid) &&
> + !capable(CAP_KILL)) {
> + switch (sig) {
> + case SIGCONT:
> + read_lock(&tasklist_lock);
> + sid = task_session(t);
> + read_unlock(&tasklist_lock);
> + /*
> + * We don't return the error if sid == NULL. The
> + * task was unhashed, the caller must notice this.
> + */
> + if (!sid || sid == task_session(current))
> + break;

Nice, in addition to a bugfix this is also far more readable.

> + default:
> + return -EPERM;
> + }
> + }
>
> return security_task_kill(t, info, sig, 0);
> }
> --- 25/security/commoncap.c~CKP_TAKE_TASKLIST 2008-03-18 17:07:02.000000000 +0300
> +++ 25/security/commoncap.c 2008-03-18 17:21:10.000000000 +0300
> @@ -552,10 +552,6 @@ int cap_task_kill(struct task_struct *p,
> if (p->uid == current->uid)
> return 0;
>
> - /* sigcont is permitted within same session */
> - if (sig == SIGCONT && (task_session_nr(current) == task_session_nr(p)))
> - return 0;
> -
> if (secid)
> /*
> * Signal sent as a particular user.

Note that cap_task_kill() should be gone anyway. What tree were you
basing this on?

thanks,
-serge