2007-11-26 14:26:11

by Oleg Nesterov

Subject: [PATCH 1/3] fix setsid() for sub-namespace /sbin/init

sys_setsid() still deals with pid_t's from the global namespace. This means
that the "session > 1" check can't help a sub-namespace init: setsid() can't
succeed because copy_process(CLONE_NEWPID) populates the PIDTYPE_PGID/SID links.

Remove the usage of task_struct->pid and convert the code to use "struct pid".
This also simplifies and speeds up the code, and saves one find_pid().

Signed-off-by: Oleg Nesterov <[email protected]>

--- PT/kernel/sys.c~1_setsid 2007-11-26 15:52:15.000000000 +0300
+++ PT/kernel/sys.c 2007-11-26 16:10:43.000000000 +0300
@@ -1045,35 +1045,33 @@ asmlinkage long sys_getsid(pid_t pid)
 asmlinkage long sys_setsid(void)
 {
 	struct task_struct *group_leader = current->group_leader;
-	pid_t session;
+	struct pid *sid = task_pid(group_leader);
+	pid_t session = pid_vnr(sid);
 	int err = -EPERM;

 	write_lock_irq(&tasklist_lock);
-
 	/* Fail if I am already a session leader */
 	if (group_leader->signal->leader)
 		goto out;

-	session = group_leader->pid;
-	/* Fail if a process group id already exists that equals the
-	 * proposed session id.
+	/* Fail if a process group id already exists that equals the proposed
+	 * session id.
 	 *
-	 * Don't check if session id == 1 because kernel threads use this
-	 * session id and so the check will always fail and make it so
-	 * init cannot successfully call setsid.
+	 * Don't check if session == 1 because kernel threads and CLONE_NEWPID
+	 * tasks use this session id and so the check will always fail and make
+	 * it so init cannot successfully call setsid.
 	 */
-	if (session > 1 && find_task_by_pid_type_ns(PIDTYPE_PGID,
-				session, &init_pid_ns))
+	if (session != 1 && pid_task(sid, PIDTYPE_PGID))
 		goto out;

 	group_leader->signal->leader = 1;
-	__set_special_pids(session, session);
+	__set_special_pids(pid_nr(sid), pid_nr(sid));

 	spin_lock(&group_leader->sighand->siglock);
 	group_leader->signal->tty = NULL;
 	spin_unlock(&group_leader->sighand->siglock);

-	err = task_pgrp_vnr(group_leader);
+	err = session;
 out:
 	write_unlock_irq(&tasklist_lock);
 	return err;
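
For readers who want to see the failure without booting a kernel, here is a
small userspace model of the two checks. It is only a sketch, not kernel code:
struct model_pid and the helpers old_check(), new_check() and print_case() are
made up for illustration, and has_pgid_link merely stands in for "some task is
attached to this pid as PIDTYPE_PGID". Only the shape of the two conditions is
taken from the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct pid (illustration only). */
struct model_pid {
	int global_nr;       /* number seen from the initial namespace, like pid_nr() */
	int virtual_nr;      /* number seen from the task's own namespace, like pid_vnr() */
	bool has_pgid_link;  /* true if some task uses this pid as its process group */
};

/* Old condition: global number, "session > 1". True means setsid() may proceed. */
static bool old_check(const struct model_pid *sid)
{
	int session = sid->global_nr;

	return !(session > 1 && sid->has_pgid_link);
}

/* New condition: namespace-local number, "session != 1". */
static bool new_check(const struct model_pid *sid)
{
	int session = sid->virtual_nr;

	return !(session != 1 && sid->has_pgid_link);
}

/* Print how both conditions treat one modelled task. */
static void print_case(const char *name, const struct model_pid *sid)
{
	printf("%-12s old=%s new=%s\n", name,
	       old_check(sid) ? "ok" : "EPERM",
	       new_check(sid) ? "ok" : "EPERM");
}
```

In this model a container init with an arbitrary global pid such as 4242, a
namespace-local pid of 1 and an existing PGID link is refused by old_check()
and allowed by new_check(); print_case("ns init", &task) shows both results
side by side.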


2007-11-26 14:43:36

by Oleg Nesterov

Subject: Re: [PATCH 1/3] fix setsid() for sub-namespace /sbin/init

On 11/26, Oleg Nesterov wrote:
>
> sys_setsid() still deals with pid_t's from the global namespace. This means
> that the "session > 1" check can't help for sub-namespace init, setsid() can't
> succeed because copy_process(CLONE_NEWPID) populates PIDTYPE_PGID/SID links.
>
> Remove the usage of task_struct->pid and convert the code to use "struct pid".
> This also simplifies and speeds up the code, and saves one find_pid().

[The body of this reply was not in English and is unreadable in this copy
because of an encoding error. The surviving fragments mention 2.6.24, a bug,
task_struct *p = find_task_by_vpid(pid), pid_t and task_pid_vnr(), followed
by this expression:]

task_pid_nr_ns(p, current->nsproxy->pid_ns);

Oleg.

2007-11-26 15:01:12

by Pavel Emelyanov

Subject: Re: [PATCH 1/3] fix setsid() for sub-namespace /sbin/init

Oleg Nesterov wrote:
> sys_setsid() still deals with pid_t's from the global namespace. This means
> that the "session > 1" check can't help for sub-namespace init, setsid() can't
> succeed because copy_process(CLONE_NEWPID) populates PIDTYPE_PGID/SID links.
>
> Remove the usage of task_struct->pid and convert the code to use "struct pid".
> This also simplifies and speeds up the code, and saves one find_pid().
>
> Signed-off-by: Oleg Nesterov <[email protected]>

Acked-by: Pavel Emelyanov <[email protected]>

> --- PT/kernel/sys.c~1_setsid 2007-11-26 15:52:15.000000000 +0300
> +++ PT/kernel/sys.c 2007-11-26 16:10:43.000000000 +0300
> @@ -1045,35 +1045,33 @@ asmlinkage long sys_getsid(pid_t pid)
>  asmlinkage long sys_setsid(void)
>  {
>  	struct task_struct *group_leader = current->group_leader;
> -	pid_t session;
> +	struct pid *sid = task_pid(group_leader);
> +	pid_t session = pid_vnr(sid);
>  	int err = -EPERM;
>
>  	write_lock_irq(&tasklist_lock);
> -
>  	/* Fail if I am already a session leader */
>  	if (group_leader->signal->leader)
>  		goto out;
>
> -	session = group_leader->pid;
> -	/* Fail if a process group id already exists that equals the
> -	 * proposed session id.
> +	/* Fail if a process group id already exists that equals the proposed
> +	 * session id.
>  	 *
> -	 * Don't check if session id == 1 because kernel threads use this
> -	 * session id and so the check will always fail and make it so
> -	 * init cannot successfully call setsid.
> +	 * Don't check if session == 1 because kernel threads and CLONE_NEWPID
> +	 * tasks use this session id and so the check will always fail and make
> +	 * it so init cannot successfully call setsid.
>  	 */
> -	if (session > 1 && find_task_by_pid_type_ns(PIDTYPE_PGID,
> -				session, &init_pid_ns))
> +	if (session != 1 && pid_task(sid, PIDTYPE_PGID))
>  		goto out;
>
>  	group_leader->signal->leader = 1;
> -	__set_special_pids(session, session);
> +	__set_special_pids(pid_nr(sid), pid_nr(sid));
>
>  	spin_lock(&group_leader->sighand->siglock);
>  	group_leader->signal->tty = NULL;
>  	spin_unlock(&group_leader->sighand->siglock);
>
> -	err = task_pgrp_vnr(group_leader);
> +	err = session;
>  out:
>  	write_unlock_irq(&tasklist_lock);
>  	return err;
>
>

2007-11-26 19:17:38

by Eric W. Biederman

Subject: Re: [PATCH 1/3] fix setsid() for sub-namespace /sbin/init

Oleg Nesterov <[email protected]> writes:

> sys_setsid() still deals with pid_t's from the global namespace. This means
> that the "session > 1" check can't help for sub-namespace init, setsid() can't
> succeed because copy_process(CLONE_NEWPID) populates PIDTYPE_PGID/SID links.

We can do even better. We can remove the misguided code from
copy_process(CLONE_NEWPID) that populates the PIDTYPE_PGID/SID links
and generally does setsid by hand, and the code from kernel_init
that calls set_special_pid(), allowing us to remove the special case
entirely.

The set_special_pid() in kernel_init() and the special case check
are actually a workaround for the fact that earlier we could not
use 0 in the pid hash table. Now that we can use init_struct_pid
directly we don't need the special case at all.

Eric

2007-11-26 20:12:48

by Oleg Nesterov

Subject: Re: [PATCH 1/3] fix setsid() for sub-namespace /sbin/init

On 11/26, Eric W. Biederman wrote:
>
> Oleg Nesterov <[email protected]> writes:
>
> > sys_setsid() still deals with pid_t's from the global namespace. This means
> > that the "session > 1" check can't help for sub-namespace init, setsid() can't
> > succeed because copy_process(CLONE_NEWPID) populates PIDTYPE_PGID/SID links.
>
> We can do even better. We can remove the misguided code from
> copy_process(CLONE_NEWPID) that populates the PIDTYPE_PGID/SID links
> and generally does setsid by hand,

Yes, you are right. IIRC there was a patch from you, but I didn't follow the
discussion, sorry, so I don't know what the verdict was.

If we remove that "almost setsid" from copy_process(), we can remove the fat
comment and the "session != 1" chunk from setsid().

> and the code from kernel_init
> that calls set_special_pid(), allowing us to remove the special case
> entirely.

This is different, perhaps we can keep this call. kernel_thread(kernel_init)
attaches /sbin/init to init_struct_pid. Nothing bad, and a "good" init should
do setsid() anyway. But who knows? Some special environment may expect that
getpgrp() != 0. Not that I really disagree on this issue though.

Oleg.

2007-11-26 21:41:46

by Eric W. Biederman

Subject: Re: [PATCH 1/3] fix setsid() for sub-namespace /sbin/init

Oleg Nesterov <[email protected]> writes:

> On 11/26, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <[email protected]> writes:
>>
>> > sys_setsid() still deals with pid_t's from the global namespace. This means
>> > that the "session > 1" check can't help for sub-namespace init, setsid() can't
>> > succeed because copy_process(CLONE_NEWPID) populates PIDTYPE_PGID/SID links.
>>
>> We can do even better. We can remove the misguided code from
>> copy_process(CLONE_NEWPID) that populates the PIDTYPE_PGID/SID links
>> and generally does setsid by hand,
>
> Yes you are right. IIRC there was a patch from you, but I didn't follow the
> discussion, sorry, so I don't know what the verdict was.

Since session == pgrp == 0 is the historical start condition for /sbin/init, there
is no problem from the session perspective; in fact it is better.

The only case that might have cared was setting si_pid when sending signals,
and it turns out it is both simple and necessary to handle that case across
namespaces anyway.

So there is no reason not to handle this.

> If we remove that "almost setsid" from copy_process(), we can remove the fat
> comment and the "session != 1" chunk from setsid().
>
>> and the code from kernel_init
>> that calls set_special_pid(), allowing us to remove the special case
>> entirely.
>
> This is different, perhaps we can keep this call. kernel_thread(kernel_init)
> attaches /sbin/init to init_struct_pid. Nothing bad, and a "good" init should
> do setsid() anyway. But who knows? Some special environment may expect that
> getpgrp() != 0. Not that I really disagree on this issue though.

init starting with session == pgrp == 0 is historical Linux behavior. I consider
the current 2.6 behavior a temporary aberration from the historical Linux behavior.

sysvinit does call setsid. And nothing really bad will happen if someone forgets
to call setsid, in some obscure version of init.

Plus once we do this the code will be easier to maintain because we have
removed one obscure special case.

Eric

2007-11-26 22:48:05

by Oleg Nesterov

Subject: Re: [PATCH 1/3] fix setsid() for sub-namespace /sbin/init

On 11/26, Eric W. Biederman wrote:
>
> Oleg Nesterov <[email protected]> writes:
>
> > This is different, perhaps we can keep this call. kernel_thread(kernel_init)
> > attaches /sbin/init to init_struct_pid. Nothing bad, and a "good" init should
> > do setsid() anyway. But who knows? Some special environment may expect that
> > getpgrp() != 0. Not that I really disagree on this issue though.
>
> init starting with session == pgrp == 0 is historical Linux behavior. I consider
> the current 2.6 behavior a temporary aberration from the historical Linux behavior.

Ah, OK.

> Plus once we do this the code will be easier to maintain because we have
> removed one obscure special case.

Yes indeed. So we can remove this special case code as soon as copy_process()
is changed.

Oleg.