2013-06-11 01:29:46

by vcaputo

[permalink] [raw]
Subject: adopt(pid_t pid) syscall proposal [patch included]

Hello all,

I'd like to propose a new system call, I know I'll probably be flamed,
but here goes.

Today there is no way to get a process which has been orphaned, and thus
become a child of init, back as a child of another process. This is
annoying me, please allow me to explain why.

The CONFIG_CHECKPOINT_RESTORE feature enables a useful little thing in
proc which I've been taking advantage of to make some interesting
omnipresent desktop monitoring features built into my own composited
window manager. That thing is the /proc/$pid/task/$pid/children file.

Using this file, I've created scoped per-window process monitors which
can efficiently limit their monitoring to only the client pid of the X
window and its decendants. These monitors are always running, but
visibility is toggled when desired by a keypress, efficiency is
important and the children file helps tremendously in this regard.

So I'm thrilled with all this and everything is fantastically efficient
and simple, until the first time I detach from my screen session and
reattach.

Initially, I'll have a scenario like:

xterm
bash
screen -S stuff
screen -S stuff
bash
bash
top
bash
vi foo.c
bash
make
sh -c cc -Wall ...
cc1 -quiet -I ...
bash

Behind these rows of text I have user/cpu graphs for each process
sliding by.

After detaching from screen, then reattaching, this is what the above
turns into, as you can probably predict:

xterm
bash
screen -dr stuff

All subsequent processes created through my interactions with this
screen are lost to the monitoring, since the "screen -S stuff" backend
is now stuck being a child of init.

With the attached adopt() syscall patch, and a 1 line change to screen's
attacher function invoking adopt(), I get the following on reattach:

xterm
bash
screen -dr stuff
screen -S stuff
bash
bash
top
bash
vi foo.c
bash
make
sh -c cc -Wall ...
cc1 -quiet -I ...


Which makes me very happy, dance around the room happy, and I'm not a
dancer.

Here is a screenshot of the results enabled by the patch:
http://pengaru.com/~swivel/vwm/screen_sys_adopt.png

Thanks for reading, and hopefuly considering this addition, I'm sure
there are other uses as well. Please CC me in any replies as I'm not
currently subscribed.

Regards,
Vito Caputo


Attachments:
(No filename) (2.33 kB)
0001-sched-implement-adopt-system-call.patch (4.11 kB)
Download all attachments

2013-06-11 16:53:35

by Andrew Lutomirski

[permalink] [raw]
Subject: Re: adopt(pid_t pid) syscall proposal [patch included]

On 06/10/2013 06:23 PM, [email protected] wrote:
> + if (!uid_eq(cred->euid, tcred->suid) &&
> + !uid_eq(cred->euid, tcred->uid) &&
> + !uid_eq(cred->uid, tcred->suid) &&
> + !uid_eq(cred->uid, tcred->uid) &&
> + !ns_capable(cred->user_ns, CAP_KILL)) {
> + ret = -EPERM;
> + goto out_unlock;
> + }
> +

That check's far too permissive.

This sounds like it will break anything that uses wait and expects its
children to not be stolen out from under it.

Also, you'll have problems with screen -x or the default tmux shareable
configuration. It sounds like this is better done in userspace.

--Andy

2013-06-11 18:38:27

by vcaputo

[permalink] [raw]
Subject: Re: adopt(pid_t pid) syscall proposal [patch included]

On Tue, Jun 11, 2013 at 09:48:22AM -0700, Andy Lutomirski wrote:
> On 06/10/2013 06:23 PM, [email protected] wrote:
> >+ if (!uid_eq(cred->euid, tcred->suid) &&
> >+ !uid_eq(cred->euid, tcred->uid) &&
> >+ !uid_eq(cred->uid, tcred->suid) &&
> >+ !uid_eq(cred->uid, tcred->uid) &&
> >+ !ns_capable(cred->user_ns, CAP_KILL)) {
> >+ ret = -EPERM;
> >+ goto out_unlock;
> >+ }
> >+
>
> That check's far too permissive.

I suspected, but that's easily improved, I just wanted to get this out
there and see what people thought, start the discussion, as well as
get my use case functional. Although I'm curious, what is your cause
for concern with the existing checks?

>
> This sounds like it will break anything that uses wait and expects its
> children to not be stolen out from under it.

This thought crossed my mind, and originally I intended to restrict
adoption to orphans who had become children of init. It seemed like
more general use might be desirable so I left that out. If this
really is a possibility worth preventing we could put that restriction
on it. To the best of my knowledge, nothing informs init of its
becoming parent of an orphan, so we should be able to steal the orphan
back without harm. This still enables the use case of screen/tmux
reattachment.

>
> Also, you'll have problems with screen -x or the default tmux shareable
> configuration. It sounds like this is better done in userspace.

It just needs some determination of "session leader" applied to which
attach adopts the backend before invoking adopt(), that logic goes in
userspace. I imagine the implementations of both screen and tmux
already have such determinations happening for other reasons.

Regards,
Vito Caputo

2013-06-11 18:47:34

by Andrew Lutomirski

[permalink] [raw]
Subject: Re: adopt(pid_t pid) syscall proposal [patch included]

On Tue, Jun 11, 2013 at 11:38 AM, <[email protected]> wrote:
> On Tue, Jun 11, 2013 at 09:48:22AM -0700, Andy Lutomirski wrote:
>> On 06/10/2013 06:23 PM, [email protected] wrote:
>> >+ if (!uid_eq(cred->euid, tcred->suid) &&
>> >+ !uid_eq(cred->euid, tcred->uid) &&
>> >+ !uid_eq(cred->uid, tcred->suid) &&
>> >+ !uid_eq(cred->uid, tcred->uid) &&
>> >+ !ns_capable(cred->user_ns, CAP_KILL)) {
>> >+ ret = -EPERM;
>> >+ goto out_unlock;
>> >+ }
>> >+
>>
>> That check's far too permissive.
>
> I suspected, but that's easily improved, I just wanted to get this out
> there and see what people thought, start the discussion, as well as
> get my use case functional. Although I'm curious, what is your cause
> for concern with the existing checks?

For example, "!uid_eq(cred->euid, tcred->uid)" means that you can
adopt setuid things. Looking at ptrace may be a good start.

>
>>
>> This sounds like it will break anything that uses wait and expects its
>> children to not be stolen out from under it.
>
> This thought crossed my mind, and originally I intended to restrict
> adoption to orphans who had become children of init. It seemed like
> more general use might be desirable so I left that out. If this
> really is a possibility worth preventing we could put that restriction
> on it. To the best of my knowledge, nothing informs init of its
> becoming parent of an orphan, so we should be able to steal the orphan
> back without harm. This still enables the use case of screen/tmux
> reattachment.
>
>>
>> Also, you'll have problems with screen -x or the default tmux shareable
>> configuration. It sounds like this is better done in userspace.
>
> It just needs some determination of "session leader" applied to which
> attach adopts the backend before invoking adopt(), that logic goes in
> userspace. I imagine the implementations of both screen and tmux
> already have such determinations happening for other reasons.

What I mean is: why not just teach your userspace tool to do this
without kernel help?

--Andy