2006-11-20 13:21:01

by Simon Richter

Subject: RFC: implement daemon() in the kernel

[please CC me on replies]

Hi,

I'm working with Linux on MMUless systems, and one of the biggest issues
in porting software is the lack of a working fork().

Except for some special cases (like OpenSSH's privilege separation), fork()
is mainly called in three cases:

- spawn off a new process, which calls exec() immediately

This can be easily replaced by a call to vfork(), which invokes the
clone() syscall with the CLONE_VFORK flag.

- split off some work into a separate thread and provide address space
separation

Since we don't have a MMU, there is no address space separation anyway,
so we can replace this with a pthread_create(), which in turn calls clone().

- daemonize a process

There is a function called daemon() that does this; its behaviour is
roughly defined by (modulo error handling)

int daemon(int nochdir, int noclose)
{
    if(!nochdir)
        chdir("/");

    if(!noclose)
    {
        int fd = open("/dev/null", O_RDWR);
        dup2(fd, 0);
        dup2(fd, 1);
        dup2(fd, 2);
        close(fd);
    }

    if(fork() > 0)
        _exit(0);
}

Since it calls _exit() right after fork() returns (so daemon() never
returns to the calling process except in case of an error) it would be
possible to implement this on MMUless machines if the last two lines
could happen in the kernel.
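
For comparison, on a system that does have a working fork(), the function above with its error handling filled in might look like the following minimal sketch. The name daemonize_sketch is hypothetical (chosen so it does not clash with libc's daemon()), the error-handling policy is an assumption, and the setsid() call is added because a daemon normally also detaches from its controlling terminal. It is not a proposed kernel interface.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical name; libc already provides daemon(). */
int daemonize_sketch(int nochdir, int noclose)
{
    if (!nochdir && chdir("/") < 0)
        return -1;

    if (!noclose) {
        int fd = open("/dev/null", O_RDWR);
        if (fd < 0)
            return -1;
        /* Point stdin, stdout and stderr at /dev/null. */
        if (dup2(fd, 0) < 0 || dup2(fd, 1) < 0 || dup2(fd, 2) < 0) {
            close(fd);
            return -1;
        }
        if (fd > 2)
            close(fd);
    }

    pid_t pid = fork();
    if (pid < 0)
        return -1;      /* fork failed: we are still the original process */
    if (pid > 0)
        _exit(0);       /* parent: never returns to the caller */

    /* Child: start a new session, dropping the controlling terminal. */
    if (setsid() < 0)
        return -1;
    assert(getsid(0) == getpid());  /* sanity check: now session leader */
    return 0;
}
```

Only the fork()/_exit() pair at the end depends on fork(); everything before it works unchanged on an MMUless system, which is why moving just that step into the kernel would be enough.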

I can see three possible implementations:

- "cheap" implementation

The process is assigned a new PID and the kernel pretends that the parent
has exited. There are a lot of pitfalls here, so it is probably not a good idea.

- a reverse vfork()

The child process is created and suspended, and the parent continues to run
until it calls exec() or _exit(). The good thing here is that it should
be easy to implement, as the infrastructure for suspending a process
until another exits already exists.

- "normal" implementation

The child is created and the parent is immediately zombified with a return
code of zero. This might be more difficult to implement, as the current
implementation of fork() does not need to terminate a process in any
way, so there might be funny locking and other issues.

Questions? Comments?

Simon


2006-11-20 15:38:15

by Mark Rustad

Subject: Re: RFC: implement daemon() in the kernel

On Nov 20, 2006, at 7:20 AM, Simon Richter wrote:

> [please CC me on replies]
>
> Hi,
>
> I'm working with Linux on MMUless systems, and one of the biggest
> issues
> in porting software is the lack of working fork().
>
> Except some special cases (like openssh's privilege separation),
> fork()
> is called in mainly three cases:
>
> - spawn off a new process, which calls exec() immediately
>
> This can be easily replaced by a call to vfork(), which invokes the
> clone() syscall with the CLONE_VFORK flag.
>
> - split off some work into a separate thread and provide address
> space
> separation
>
> Since we don't have a MMU, there is no address space separation
> anyway,
> so we can replace this with a pthread_create(), which in turn calls
> clone().
>
> - daemonize a process
>
> There is a function called daemon() that does this; its behaviour is
> roughly defined by (modulo error handling)
>
> int daemon(int nochdir, int noclose)
> {
> if(!nochdir)
> chdir("/");
>
> if(!noclose)
> {
> int fd = open("/dev/null", O_RDWR);
> dup2(fd, 0);
> dup2(fd, 1);
> dup2(fd, 2);
> close(fd);
> }
>
> if(fork() > 0)
> _exit(0);
> }
>
> Since it calls _exit() right after fork() returns (so daemon() never
> returns to the calling process except in case of an error) it would be
> possible to implement this on MMUless machines if the last two lines
> could happen in the kernel.
>
> I can see three possible implementations:
>
> - "cheap" implementation
>
> The process is assigned a new PID and the parent is pretended to have
> exited. There are a lot of pitfalls here, so it is probably not a
> good idea.
>
> - a reverse vfork()
>
> The child process is created and suspended, the parent continues to
> run
> until it calls exec() or _exit(). The good thing here is that it
> should
> be easy to implement as the infrastructure for suspending a process
> until another exits already exists.
>
> - "normal" implementation
>
> The child is created, the parent immediately zombified with a return
> code of zero. This might be more difficult to implement as the current
> implementation of fork() does not need to terminate a process in any
> way, so there might be funny locking and other issues.
>
> Questions? Comments?

There is a better way. Simply implement fork(). It can be done even
without an MMU. People think it is impossible, but that is only
because they don't consider the possibility of copying memory back
and forth on task switch. It sounds horrible, but in the vast
majority of cases, either the parent or the child exits or does an
exec pretty quickly, so in reality it doesn't cost much. The benefits
are many, such as being able to use real shells like bash and thereby
real shell scripts.

When I was at BRECIS we implemented this in a 2.4 uClinux kernel - as
well as in an OpenBSD port. I can't take any credit for this work - a
friend of mine did it - but at least I recognized that such a thing
was possible. Having seen the results of this before, this really is
the way to go to improve MMU-less systems.

You do have to look out for any applications that fork and do not
either exit or exec, but that is so much better than having to modify
so many things just to get them to run.

--
Mark Rustad, [email protected]

2006-11-20 17:43:04

by Simon Richter

Subject: Re: RFC: implement daemon() in the kernel

Hi,

Mark Rustad wrote:

> There is a better way. Simply implement fork(). It can be done even
> without an MMU. People think it is impossible, but that is only because
> they don't consider the possibility of copying memory back and forth on
> task switch. It sounds horrible, but in the vast majority of cases,
> either the parent or child either exits or does an exec pretty quickly,
> so in reality it doesn't cost much. The benefits are many: being able to
> use real shells such as bash and thereby being able to use real shell
> scripts.

This imposes quite a significant overhead for the common case (in which
the application has specifically requested that the parent process be
terminated after the child process is fork()ed off). Even if the cost of
transferring memory contents were cheap (which it isn't), you'd annoy the
memory management subsystem unless you did a lot of weird tricks to
avoid allocating from a large block.

> You do have to look out for any applications that fork and do not either
> exit or exec, but that is so much better than having to modify so many
> things just to get them to run.

Well, in fact just having a libc that does not define a symbol for
"fork" and then going to the places the linker reports as having
undefined references is a pretty easy way to find them. Mind you, in 90% of
cases you can replace them with a vfork() and be done.
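
The common replacement, where a fork() that is immediately followed by exec() becomes a vfork(), can be sketched roughly as below. The helper name spawn_and_wait is hypothetical, and returning 127 when exec fails is an assumption borrowed from the shell convention for "command not found".

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run the program named by argv[0], wait for it,
 * and return its exit status, or -1 if it could not be started or
 * did not exit normally.
 */
int spawn_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /*
         * Child: it borrows the parent's memory while the parent is
         * suspended, so it must do nothing except exec or _exit().
         */
        execvp(argv[0], argv);
        _exit(127);     /* only reached if exec failed */
    }

    /* Parent: resumes once the child has called exec or _exit(). */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The call sites that cannot be converted this way are the ones where the child does real work between fork() and exec(), since it would be modifying the suspended parent's memory.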

Simon

2006-11-20 20:38:56

by Jan Engelhardt

Subject: Re: RFC: implement daemon() in the kernel


On Nov 20 2006 14:20, Simon Richter wrote:
>
> - a reverse vfork()
>
>The child process is created and suspended, the parent continues to run
>until it calls exec() or _exit(). The good thing here is that it should
>be easy to implement as the infrastructure for suspending a process
>until another exits already exists.

How about the Cygwin way, i.e. 'suspend' the parent and let the child
run after fork?

Your case: If it exec()s within a specific time limit, fine. If not, you
can follow the suggestion to copy its entire memory space.


-`J'
--

2006-11-20 20:48:28

by Mark Rustad

Subject: Re: RFC: implement daemon() in the kernel

On Nov 20, 2006, at 11:42 AM, Simon Richter wrote:

> Mark Rustad wrote:
>
>> There is a better way. Simply implement fork(). It can be done
>> even without an MMU. People think it is impossible, but that is
>> only because they don't consider the possibility of copying memory
>> back and forth on task switch. It sounds horrible, but in the vast
>> majority of cases, either the parent or child either exits or does
>> an exec pretty quickly, so in reality it doesn't cost much. The
>> benefits are many: being able to use real shells such as bash and
>> thereby being able to use real shell scripts.
>
> This imposes quite a significant overhead for the common case (in
> which the application has specifically requested that the parent
> process be terminated after the child process is fork()ed off).
> Even if the cost of transferring memory contents was cheap (which
> it isn't), you'd annoy the memory management subsystem unless you
> did a lot of weird tricks to avoid allocating from a large block.

Yes. I did not mean to suggest that vfork() should go away or that
shells that make use of it go away. It is just that making fork()
work has a lot of value. vfork() would always be the optimal thing to
use, but sometimes you need the power of a real fork(). Greater
compatibility with normal Linux is of greater value than adding more
funky special-purpose system calls.

>> You do have to look out for any applications that fork and do not
>> either exit or exec, but that is so much better than having to
>> modify so many things just to get them to run.
>
> Well, in fact just having a libc that does not define a symbol for
> "fork" and then going to the places the linker mentions as having
> undefined references is a pretty easy way. Mind you, in 90% of
> cases you can replace them by a vfork() and be done.

Yes, but some of those 10% cases can be a real pain. Also, if you are
supporting users that just want some app to run, having fewer porting
barriers is a real help. Often the expense of fork() is only a
startup thing anyway and not a factor in the normal steady-state
operation of a system.

--
Mark Rustad, [email protected]

2006-11-21 00:38:55

by H. Peter Anvin

Subject: Re: RFC: implement daemon() in the kernel

Simon Richter wrote:
>
> - daemonize a process
>
> There is a function called daemon() that does this; its behaviour is
> roughly defined by (modulo error handling)
>
> int daemon(int nochdir, int noclose)
> {
> if(!nochdir)
> chdir("/");
>
> if(!noclose)
> {
> int fd = open("/dev/null", O_RDWR);
> dup2(fd, 0);
> dup2(fd, 1);
> dup2(fd, 2);
> close(fd);
> }
>
> if(fork() > 0)

... that should be if (fork() == 0) ...

> _exit(0);

setsid();
> }
>


> Since it calls _exit() right after fork() returns (so daemon() never
> returns to the calling process except in case of an error) it would be
> possible to implement this on MMUless machines if the last two lines
> could happen in the kernel.
>

You could do this quite easily with clone() and a small assembly wrapper.

The assembly wrapper needs to do the last two lines without touching the
stack in the parent. That is usually quite trivial, even on
register-starved architectures; for example, on i386 it would look like
(ignoring vsyscalls for the moment, which are only an optimization anyway):

__detach_from_parent:
        pushl   %ebx
        movl    $__NR_clone, %eax
        movl    $CLONE_VM|SIGCHLD, %ebx
        xorl    %ecx, %ecx
        int     $0x80
        cmpl    $-4096, %eax
        ja      1f
        andl    %eax, %eax
        je      2f
        # Parent process, must _exit(0)
        xorl    %ebx, %ebx
        movl    $__NR_exit, %eax
        int     $0x80
        # _exit() should never return
        hlt
1:      # Error on fork(), set errno and return -1
        negl    %eax
        movl    %eax, errno     # Or TLS equivalent
        orl     $-1, %eax
2:      # Child process jumps here with %eax == 0 already
        popl    %ebx
        ret

2006-11-21 09:30:51

by Michal Schmidt

Subject: Re: RFC: implement daemon() in the kernel

H. Peter Anvin wrote:
> Simon Richter wrote:
>> int daemon(int nochdir, int noclose)
>> {
>> if(!nochdir)
>> chdir("/");
>>
>> if(!noclose)
>> {
>> int fd = open("/dev/null", O_RDWR);
>> dup2(fd, 0);
>> dup2(fd, 1);
>> dup2(fd, 2);
>> close(fd);
>> }
>>
>> if(fork() > 0)
>
> ... that should be if (fork() == 0) ...

Are you sure? fork()==0 means we're the child, but it's the parent who
should exit, isn't it?
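
The return-value convention in question can be illustrated with a small sketch for a system with a working fork(). The names fork_role, role_from_fork and child_status_seen_by_parent are hypothetical. Note that in daemon() it is the parent branch (return value > 0) that calls _exit(0); here the child exits instead, so that the surviving parent can report what it observed.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum fork_role { FORK_FAILED, FORK_CHILD, FORK_PARENT };

/* Classify the value returned by fork(). */
enum fork_role role_from_fork(pid_t ret)
{
    if (ret < 0)
        return FORK_FAILED;                     /* no child was created */
    return ret == 0 ? FORK_CHILD : FORK_PARENT; /* parent gets the child's PID */
}

/* Fork a child that exits with child_code; return the status the parent sees. */
int child_status_seen_by_parent(int child_code)
{
    assert(child_code >= 0 && child_code <= 255);

    pid_t pid = fork();
    switch (role_from_fork(pid)) {
    case FORK_FAILED:
        return -1;
    case FORK_CHILD:
        _exit(child_code);      /* fork() returned 0: we are the child */
    case FORK_PARENT:
        break;                  /* fork() returned the child's PID (> 0) */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```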

>
>> _exit(0);
>
> setsid();
>> }
>>

Michal

2006-11-21 17:15:46

by H. Peter Anvin

Subject: Re: RFC: implement daemon() in the kernel

Michal Schmidt wrote:
> H. Peter Anvin wrote:
>> Simon Richter wrote:
>>> int daemon(int nochdir, int noclose)
>>> {
>>> if(!nochdir)
>>> chdir("/");
>>>
>>> if(!noclose)
>>> {
>>> int fd = open("/dev/null", O_RDWR);
>>> dup2(fd, 0);
>>> dup2(fd, 1);
>>> dup2(fd, 2);
>>> close(fd);
>>> }
>>>
>>> if(fork() > 0)
>>
>> ... that should be if (fork() == 0) ...
>
> Are you sure? fork()==0 means we're the child, but it's the parent who
> should exit, isn't it?
>

Oh, right, of course. Thinko; the lack of error handling confused me.
I did that right in the assembly code.

-hpa