2007-05-29 03:23:17

by Albert Cahalan

Subject: Re: [RFC, PATCH 1/3] introduce SYS_CLONE_MASK

Jan Engelhardt writes:
> On Apr 10 2007 17:47, Jan Engelhardt wrote:
>> On Apr 8 2007 20:57, Oleg Nesterov wrote:

>>> Anyway, re-parenting to swapper breaks pstree, it doesn't
>>> show kernel threads. And if ->parent == /sbin/init, we can't
>>> remove us from ->children (unless we forbid sub-thread-of-init
>>> exec). So the only safe change is set ->exit_state = -1.
>>
>> Then we have to fix pstree and all that. (In fact, I'm
>> trying to patch `ps f` to DTRT ;p)
>
> Done that and the result is that `ps afwx` now looks like:
>
> PID TTY STAT TIME COMMAND
> 2722 ? S 0:00 [lockd]
...
> 3 ? S< 0:00 [events/0]
> 2 ? SN 0:00 [ksoftirqd/0]
> 1 ? Ss 0:02 init [3]
> 537 ? S<s 0:02 \_ /sbin/udevd --daemon
> 1600 ? Ss 0:00 \_ /usr/bin/dbus-daemon --system
> 1692 ? Ss 0:00 \_ /sbin/acpid
> 1923 ? Ss 0:00 \_ /sbin/resmgrd
...
> - if(self_pid==1 && ADOPTED(processes[i]) && forest_type!='u')
> + if(ADOPTED(processes[i]) && forest_type!='u')

That's not compatible because init's children are now in the
logical place. Since the days of procps-1.x.x or earlier,
such processes have been listed at top level.

BTW, what does "ps -ejH" do for you, with and without the patch?

I'd be a lot happier about breaking compatibility in this area
if I could get a functional adoption flag. That is, I really
would like to show a process as child of init if it naturally
was created as a child of init. It's less informative to have
fake children showing up the same as real ones. The original
parent PID would do. (BTW, the original parent name and/or
grandparent PID would be great to have) As a bonus, the kernel
could reap these processes more quickly than init can... and
then maybe we can stop caring if init is alive.


2007-05-29 04:55:07

by Eric W. Biederman

Subject: Re: [RFC, PATCH 1/3] introduce SYS_CLONE_MASK

"Albert Cahalan" <[email protected]> writes:

> Jan Engelhardt writes:
>> On Apr 10 2007 17:47, Jan Engelhardt wrote:
>>> On Apr 8 2007 20:57, Oleg Nesterov wrote:
>
>>>> Anyway, re-parenting to swapper breaks pstree, it doesn't
>>>> show kernel threads. And if ->parent == /sbin/init, we can't
>>>> remove us from ->children (unless we forbid sub-thread-of-init
>>>> exec). So the only safe change is set ->exit_state = -1.
>>>
>>> Then we have to fix pstree and all that. (In fact, I'm
>>> trying to patch `ps f` to DTRT ;p)
>>
>> Done that and the result is that `ps afwx` now looks like:
>>
>> PID TTY STAT TIME COMMAND
>> 2722 ? S 0:00 [lockd]
> ...
>> 3 ? S< 0:00 [events/0]
>> 2 ? SN 0:00 [ksoftirqd/0]
>> 1 ? Ss 0:02 init [3]
>> 537 ? S<s 0:02 \_ /sbin/udevd --daemon
>> 1600 ? Ss 0:00 \_ /usr/bin/dbus-daemon --system
>> 1692 ? Ss 0:00 \_ /sbin/acpid
>> 1923 ? Ss 0:00 \_ /sbin/resmgrd
> ...
>> - if(self_pid==1 && ADOPTED(processes[i]) && forest_type!='u')
>> + if(ADOPTED(processes[i]) && forest_type!='u')
>
> That's not compatible because init's children are now in the
> logical place. Since the days of procps-1.x.x or earlier,
> such processes have been listed at top level.
>
> BTW, what does "ps -ejH" do for you, with and without the patch?

ps -ejH displays everything. For 2.6.22 we will only have kthreadd
as a sibling of init with ppid == 0. Depending on how the way we
start kernel threads evolves, we may be able to remove kthreadd and
have all kthreads with a ppid of 0, but only time will tell.

> I'd be a lot happier about breaking compatibility in this area
> if I could get a functional adoption flag. That is, I really
> would like to show a process as child of init if it naturally
> was created as a child of init. It's less informative to have
> fake children showing up the same as real ones. The original
> parent PID would do. (BTW, the original parent name and/or
> grandparent PID would be great to have) As a bonus, the kernel
> could reap these processes more quickly than init can... and
> then maybe we can stop caring if init is alive.

Having the kernel not reparent user processes to init is an interesting
idea, especially when those processes have not exited. I'm not
certain that is POSIX compliant and otherwise backwards compatible.

Eric



2007-05-29 05:56:58

by Roland McGrath

Subject: Re: [RFC, PATCH 1/3] introduce SYS_CLONE_MASK

> Having the kernel not reparent user processes to init is an interesting
> idea, especially when those processes have not exited. I'm not
> certain that is POSIX compliant and otherwise backwards compatible.

It's hard to see how it would work. There has to be some parent PID. The
reason using 1 makes sense is that it is always there. Anything >0 and not
the PID of some live process could be reused for a new process at some point.


Thanks,
Roland
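
The reparenting behaviour discussed above can be observed from userspace.
The following is a minimal sketch, not code from this thread:
observe_reparenting() is a hypothetical helper that orphans a grandchild
and reports the parent PID it sees before and after its parent exits.
On the kernels discussed here the new parent would be PID 1; on later
Linux kernels a "child subreaper" process may adopt the orphan instead,
so the sketch only assumes that the value changes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): fork an intermediate
 * process that forks a grandchild and exits at once. The grandchild
 * waits for its parent PID to change and reports both values through
 * a pipe. Returns 0 on success, -1 on error.
 */
int observe_reparenting(pid_t *orig_ppid, pid_t *new_ppid)
{
    assert(orig_ppid != NULL && new_ppid != NULL);

    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (child == 0) {
        /* Intermediate process: remember our PID, fork, then exit. */
        pid_t self = getpid();
        close(fds[0]);
        pid_t grandchild = fork();
        if (grandchild != 0)
            _exit(grandchild < 0 ? 1 : 0);  /* orphans the grandchild */

        /* Grandchild: poll (for up to about 5 s) until reparented. */
        struct timespec delay = { 0, 10 * 1000 * 1000 };  /* 10 ms */
        for (int i = 0; i < 500 && getppid() == self; i++)
            nanosleep(&delay, NULL);

        pid_t report[2] = { self, getppid() };
        ssize_t w = write(fds[1], report, sizeof report);
        _exit(w == (ssize_t)sizeof report ? 0 : 1);
    }

    /* Original process: reap the intermediate child, read the report. */
    close(fds[1]);
    waitpid(child, NULL, 0);
    pid_t report[2];
    ssize_t r = read(fds[0], report, sizeof report);
    close(fds[0]);
    if (r != (ssize_t)sizeof report)
        return -1;

    *orig_ppid = report[0];
    *new_ppid = report[1];
    return 0;
}
```

A caller passes two pid_t variables and compares them afterwards. The
orphaned grandchild is left for its new parent to reap, which is exactly
the work the thread proposes to take away from init.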

2007-05-30 00:33:54

by Albert Cahalan

Subject: Re: [RFC, PATCH 1/3] introduce SYS_CLONE_MASK

On 5/29/07, Eric W. Biederman <[email protected]> wrote:
> "Albert Cahalan" <[email protected]> writes:
> > Jan Engelhardt writes:

>>> - if(self_pid==1 && ADOPTED(processes[i]) && forest_type!='u')
>>> + if(ADOPTED(processes[i]) && forest_type!='u')
>>
>> That's not compatible because init's children are now in the
>> logical place. Since the days of procps-1.x.x or earlier,
>> such processes have been listed at top level.
>>
>> BTW, what does "ps -ejH" do for you, with and without the patch?
>
> ps -ejH displays everything.

That's not what I mean. (the "-e" causes that of course)
I'm asking about the parent-child relationships shown.
The "-H" option is a bit different from the "f" option.

>> I'd be a lot happier about breaking compatibility in this area
>> if I could get a functional adoption flag. That is, I really
>> would like to show a process as child of init if it naturally
>> was created as a child of init. It's less informative to have
>> fake children showing up the same as real ones. The original
>> parent PID would do. (BTW, the original parent name and/or
>> grandparent PID would be great to have) As a bonus, the kernel
>> could reap these processes more quickly than init can... and
>> then maybe we can stop caring if init is alive.
>
> Having the kernel not reparent user processes to init is an interesting
> idea, especially when those processes have not exited. I'm not
> certain that is POSIX compliant and otherwise backwards compatible.

I'm not suggesting that this be visible via POSIX APIs.

It's almost certainly a given that getppid() must return 1, and
probably /proc needs to show this as well. Without question,
any process created by init must be reaped by init.

Processes NOT created by init could be silently reaped by
the kernel. They need to see their own PPID as 1, but there
need not be any parent-child relationship in the kernel data
structures. The kernel can fake the whole thing, which is nice
because then the kernel isn't depending on userspace to
correctly perform the pointless action of playing with zombies.
(might setting the death signal to 0 be useful here?)
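
For comparison, POSIX already lets a process ask the kernel to reap its
own children without any wait call, via SA_NOCLDWAIT on SIGCHLD. This is
not the mechanism proposed above (which concerns orphans adopted by
init), only an existing example of the kernel discarding a dead child
without help from userspace. A minimal sketch, where
child_is_auto_reaped() is a hypothetical helper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a child that exited under
 * SA_NOCLDWAIT left no zombie behind, 0 if it did, -1 on error.
 */
int child_is_auto_reaped(void)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_DFL;
    sa.sa_flags = SA_NOCLDWAIT;     /* children do not become zombies */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old) < 0)
        return -1;

    int result = -1;
    pid_t child = fork();
    if (child == 0)
        _exit(0);                   /* child: exit immediately */
    if (child > 0) {
        /*
         * waitpid() blocks until the child has terminated and then
         * fails with ECHILD, because there is no zombie to collect.
         */
        errno = 0;
        pid_t w = waitpid(child, NULL, 0);
        result = (w == -1 && errno == ECHILD) ? 1 : 0;
    }

    /* Restore the caller's original SIGCHLD disposition. */
    int rc = sigaction(SIGCHLD, &old, NULL);
    assert(rc == 0);
    (void)rc;
    return result;
}
```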

For "ps fax" and such, I'd like to distinguish between init's
real and adopted children. Right now the adopted children
look like they were created by init, which is not true. I only
need a simple boolean flag, set upon reparenting, to tell me.
Such a flag may also be useful for optimizing away the whole
wait/waitpid/wait4/waitid/wait3 nonsense when an adopted
child dies.
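
As an illustration of how a ps-like tool might use such a flag, here is
a minimal sketch. Everything in it is an assumption made for the
example: struct proc_info, its adopted field and is_forest_root() are
hypothetical and correspond to no existing kernel or procps interface.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical per-process record; `adopted` does not exist today. */
struct proc_info {
    pid_t pid;
    pid_t ppid;
    int adopted;    /* hypothetical flag: nonzero once reparented */
};

/*
 * Return nonzero if p should be drawn at the top level of the forest.
 * Adopted children of init are treated as roots, so that only the
 * processes init really created appear beneath it.
 */
int is_forest_root(const struct proc_info *p,
                   const struct proc_info *all, size_t n)
{
    assert(p != NULL);

    if (p->ppid == 1 && p->adopted)
        return 1;       /* orphan adopted by init: not a real child */

    for (size_t i = 0; i < n; i++)
        if (all[i].pid == p->ppid)
            return 0;   /* parent is listed: draw p beneath it */

    return 1;           /* parent not listed (for example ppid 0) */
}
```

With this rule a daemon started by init is nested under PID 1, while a
daemon that double-forked away from a shell is shown at top level, as
has been the case since procps-1.x.x.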

2007-05-30 01:44:48

by Eric W. Biederman

Subject: Re: [RFC, PATCH 1/3] introduce SYS_CLONE_MASK

"Albert Cahalan" <[email protected]> writes:

> On 5/29/07, Eric W. Biederman <[email protected]> wrote:
>> "Albert Cahalan" <[email protected]> writes:
>
> That's not what I mean. (the "-e" causes that of course)
> I'm asking about the parent-child relationships shown.
> The "-H" option is a bit different from the "f" option.

Yes, sorry. On the unmodified ps the parent-child relationship
seems to be displayed properly.

>>> I'd be a lot happier about breaking compatibility in this area
>>> if I could get a functional adoption flag. That is, I really
>>> would like to show a process as child of init if it naturally
>>> was created as a child of init. It's less informative to have
>>> fake children showing up the same as real ones. The original
>>> parent PID would do. (BTW, the original parent name and/or
>>> grandparent PID would be great to have) As a bonus, the kernel
>>> could reap these processes more quickly than init can... and
>>> then maybe we can stop caring if init is alive.
>>
>> Having the kernel not reparent user processes to init is an interesting
>> idea, especially when those processes have not exited. I'm not
>> certain that is POSIX compliant and otherwise backwards compatible.
>
> I'm not suggesting that this be visible via POSIX APIs.
>
> It's almost certainly a given that getppid() must return 1, and
> probably /proc needs to show this as well. Without question,
> any process created by init must be reaped by init.
>
> Processes NOT created by init could be silently reaped by
> the kernel. They need to see their own PPID as 1, but there
> need not be any parent-child relationship in the kernel data
> structures. The kernel can fake the whole thing, which is nice
> because then the kernel isn't depending on userspace to
> correctly perform the pointless action of playing with zombies.
> (might setting the death signal to 0 be useful here?)
>
> For "ps fax" and such, I'd like to distinguish between init's
> real and adopted children. Right now the adopted children
> look like they were created by init, which is not true. I only
> need a simple boolean flag, set upon reparenting, to tell me.
> Such a flag may also be useful for optimizing away the whole
> wait/waitpid/wait4/waitid/wait3 nonsense when an adopted
> child dies.

I will keep it in mind. A simple "this process has been reparented"
flag probably won't be too bad. As for the rest, I'm not certain.

With pid namespaces there is a certain sense in doing something like
this, but I'm not certain /sbin/init and all of its replacements
don't care (although admittedly it would be a stretch to tell the
difference).

Eric