2007-06-08 08:49:45

by Daniel Colascione

Subject: O_CLOEXEC: An alternate proposal

Hey, this is my first post to linux-kernel, so please be kind. :-)

Linus Torvalds wrote on May 31:
> I'm with Uli on this one. "Stateful" stuff is bad. It's essentially
> impossible to handle with libraries - either the library would have to
> explicitly always turn the state the way _it_ needs it, or the library
> will do the wrong thing.

I agree that stateful stuff is generally not very elegant,
but I think it's a win here -- we wouldn't have to create any
new APIs except for the state-setting stuff.

The state just has to be thread-local.

If it's thread-local, a library, say, glibc,
can use code like this:

/* Internal library function */
old_fd_flags = kernel_default_fd_flags(FD_CLOEXEC | FD_RANDFD);
event_fd = super_duper_event_polling_mechanism_fd();
kernel_default_fd_flags(old_fd_flags);

I think that's a lot cleaner than augmenting every
present and future fd-creating syscall to take some kind
of flags parameter and adding some kind of funny dup().
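
For comparison, here is a minimal sketch of the flags-parameter approach
mentioned above, using the O_CLOEXEC flag from the subject line. It assumes
a kernel and libc that accept O_CLOEXEC in open(); the helper names
open_cloexec() and fd_has_cloexec() are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path read-only with close-on-exec requested
 * in the same call that creates the descriptor. */
static int open_cloexec(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Hypothetical helper: 1 if FD_CLOEXEC is set on fd, 0 if clear, -1 on error. */
static int fd_has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Here the flag travels with the call itself, so there is no per-thread
state to save or restore; the cost is that each fd-creating call needs
its own way to accept the flag.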

Other threads, and the caller of this function in the same thread,
aren't even aware that the library is changing any state. It's
race-free, since the default flags wouldn't be inherited across clone()
or exec(). It's still POSIX compliant too as long as the default
flags set remains empty.
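
A rough user-space sketch of the save/set/restore pattern, under loud
assumptions: kernel_default_fd_flags() does not exist, so
emulated_default_fd_flags(), open_with_defaults(), and
library_open_private_fd() are hypothetical stand-ins, and FD_RANDFD is
left out because no such flag exists either. A C11 thread-local variable
plays the role of the proposed per-thread kernel state.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical per-thread default descriptor flags (user-space emulation). */
static _Thread_local int default_fd_flags = 0;

/* Stand-in for the proposed call: install new defaults, return the old ones. */
static int emulated_default_fd_flags(int new_flags)
{
    int old = default_fd_flags;
    default_fd_flags = new_flags;
    return old;
}

/* Stand-in for any fd-creating call. The emulation applies the defaults
 * with a second fcntl(); the proposal would do this inside the kernel. */
static int open_with_defaults(const char *path, int oflags)
{
    int fd = open(path, oflags);
    if (fd >= 0 && (default_fd_flags & FD_CLOEXEC))
        fcntl(fd, F_SETFD, FD_CLOEXEC);
    return fd;
}

/* Library-internal helper following the pattern from the post:
 * save the caller's defaults, set ours, create the fd, restore. */
static int library_open_private_fd(void)
{
    int old = emulated_default_fd_flags(FD_CLOEXEC);
    int fd = open_with_defaults("/dev/null", O_RDONLY);
    emulated_default_fd_flags(old);
    return fd;
}

/* 1 if FD_CLOEXEC is set on fd, 0 if clear, -1 on error. */
static int has_cloexec(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

After library_open_private_fd() returns, a later open_with_defaults()
call in the same thread sees the empty default set again, which is the
property the paragraph above relies on.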

The only disadvantage I can think of is that it requires three
system calls instead of one, but most of your time is going to be spent
working with event_fd, not creating it.

Also, what about an FD_CLOFORK? That seems to match more closely
what you'd want out of a library-internal FD than FD_CLOEXEC does.


2007-06-08 09:21:24

by Eric Dumazet

Subject: Re: O_CLOEXEC: An alternate proposal

On Fri, 8 Jun 2007 03:47:12 -0400 (EDT)
"Daniel Colascione" <[email protected]> wrote:

> Hey, this is my first post to linux-kernel, so please be kind. :-)

Welcome Daniel

>
> Linus Torvalds wrote on May 31:
> > I'm with Uli on this one. "Stateful" stuff is bad. It's essentially
> > impossible to handle with libraries - either the library would have to
> > explicitly always turn the state the way _it_ needs it, or the library
> > will do the wrong thing.
>
> I agree that stateful stuff is generally not very elegant,
> but I think it's a win here -- we wouldn't have to create any
> new APIs except for the state-setting stuff.
>
> The state just has to be thread-local.
>
> If it's thread-local, a library, say, glibc,
> can use code like this:
>
> /* Internal library function */
> old_fd_flags = kernel_default_fd_flags(FD_CLOEXEC | FD_RANDFD);

<race here if a signal handler runs some user code messing with a thread-local fd_flags >

> event_fd = super_duper_event_polling_mechanism_fd();
> kernel_default_fd_flags(old_fd_flags);
>
> I think that's a lot cleaner than augmenting every
> present and future fd-creating syscall to take some kind
> of flags parameter and adding some kind of funny dup().
>

That's funny, you probably missed Linus's syscall_indirect() proposal,
which basically does the same thing but with one syscall (so no races, and faster)

http://marc.info/?l=linux-kernel&m=118124716616552&w=2

2007-06-08 10:26:18

by Jakub Jelinek

Subject: Re: O_CLOEXEC: An alternate proposal

On Fri, Jun 08, 2007 at 03:47:12AM -0400, Daniel Colascione wrote:
> Hey, this is my first post to linux-kernel, so please be kind. :-)
>
> Linus Torvalds wrote on May 31:
> > I'm with Uli on this one. "Stateful" stuff is bad. It's essentially
> > impossible to handle with libraries - either the library would have to
> > explicitly always turn the state the way _it_ needs it, or the library
> > will do the wrong thing.
>
> I agree that stateful stuff is generally not very elegant,
> but I think it's a win here -- we wouldn't have to create any
> new APIs except for the state-setting stuff.
>
> The state just has to be thread-local.
>
> If it's thread-local, a library, say, glibc,
> can use code like this:
>
> /* Internal library function */
> old_fd_flags = kernel_default_fd_flags(FD_CLOEXEC | FD_RANDFD);
> event_fd = super_duper_event_polling_mechanism_fd();
> kernel_default_fd_flags(old_fd_flags);

It is not a win. What if a signal arrives between the two
kernel_default_fd_flags syscalls? open and other functions
are async-signal-safe, and programs will certainly be upset
if the syscalls in a signal handler suddenly start to behave
differently depending on exactly which code the async signal
has interrupted.
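
A minimal single-threaded sketch of that hazard, with the proposed state
emulated in user space. Everything named here (default_fd_flags,
open_with_defaults(), run_demo()) is hypothetical, since no
kernel_default_fd_flags syscall exists; a plain sig_atomic_t stands in for
the per-thread state so the handler can read it safely.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical stand-in for the proposed per-thread default flags. */
static volatile sig_atomic_t default_fd_flags = 0;
/* What the handler saw: 1 = its fd got FD_CLOEXEC, 0 = not, -1 = not run. */
static volatile sig_atomic_t handler_fd_cloexec = -1;

/* Emulated fd-creating call that honours the current defaults. */
static int open_with_defaults(const char *path, int oflags)
{
    int fd = open(path, oflags);
    if (fd >= 0 && (default_fd_flags & FD_CLOEXEC))
        fcntl(fd, F_SETFD, FD_CLOEXEC);
    return fd;
}

/* User signal handler: opens a file expecting the normal (empty) defaults. */
static void on_signal(int signo)
{
    (void)signo;
    int fd = open_with_defaults("/dev/null", O_RDONLY);
    if (fd >= 0) {
        int flags = fcntl(fd, F_GETFD);
        handler_fd_cloexec = (flags >= 0 && (flags & FD_CLOEXEC)) ? 1 : 0;
        close(fd);
    }
}

/* Runs the library's set/create/restore sequence and delivers SIGUSR1
 * either inside the window (nonzero argument) or after the restore.
 * Returns what the handler observed. */
static int run_demo(int interrupt_inside_window)
{
    struct sigaction sa = {0};
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;
    handler_fd_cloexec = -1;

    int old = default_fd_flags;          /* save */
    default_fd_flags = FD_CLOEXEC;       /* set library defaults */
    if (interrupt_inside_window)
        raise(SIGUSR1);                  /* handler runs with library defaults */
    int fd = open_with_defaults("/dev/null", O_RDONLY);
    default_fd_flags = old;              /* restore */
    if (!interrupt_inside_window)
        raise(SIGUSR1);                  /* handler runs with caller defaults */
    if (fd >= 0)
        close(fd);
    return handler_fd_cloexec;
}
```

In this emulation the same handler gets a close-on-exec descriptor or a
plain one depending only on where the signal lands relative to the
set/restore pair.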

Jakub