2000-10-26 16:53:00

by Jonathan Lemon

Subject: Re: kqueue microbenchmark results

On Thu, Oct 26, 2000 at 02:16:28AM -0700, Gideon Glass wrote:
> Jonathan Lemon wrote:
> >
> > Also, consider the following scenario for the proposed get_event():
> >
> > 1. packet arrives, queues an event.
> > 2. user retrieves event.
> > 3. second packet arrives, queues event again.
> > 4. user reads() all data.
> >
> > Now, next time around the loop, we get a notification for an event
> > when there is no data to read. The application now must be prepared
> > to handle this case (meaning no blocking read() calls can be used).
> >
> > Also, what happens if the user closes the socket after step 4 above?
>
> Depends on the implementation. If the item in the queue is the
> struct file (or whatever an fd indexes to), then the implementation
> can only queue the fd once. This also avoids the problem with
> closing sockets - close() would naturally do a list_del() or whatever
> on the struct file.
>
> At least I think it could be implemented this way...

kqueue currently does this; a close() on an fd will remove any pending
events from the queues that they are on which correspond to that fd.
I was trying to point out that it isn't as simple as it would seem at
first glance, as you have to consider issues like this. Also, if the
implementation allows multiple event types per fd (leading to multiple
queued events per fd), there is no longer a 1:1 mapping to something like
'struct file', and performing a list walk doesn't scale very well.
--
Jonathan
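
[The pitfall described above is worth spelling out. Below is a minimal
sketch, not from the original mail, of the defensive loop that a one-shot
event queue forces on the application: the descriptor must be non-blocking
and the loop must tolerate a notification for data that was already
consumed. kqueue calls are used only for concreteness, with EV_CLEAR
standing in for the one-shot behaviour of the proposed get_event().]

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

static void
watch_and_read(int kq, int fd)
{
    struct kevent ev;
    char buf[8192];
    ssize_t n;

    /* the fd must be non-blocking, or a stale notification would
       leave us stuck in read() */
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    /* EV_CLEAR approximates a one-shot, queue-per-arrival interface */
    EV_SET(&ev, fd, EVFILT_READ, EV_ADD | EV_CLEAR, 0, 0, NULL);
    kevent(kq, &ev, 1, NULL, 0, NULL);

    for (;;) {
        if (kevent(kq, NULL, 0, &ev, 1, NULL) < 1)
            continue;
        n = read((int)ev.ident, buf, sizeof(buf));
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            continue;       /* notification for data already consumed */
        if (n <= 0)
            break;          /* error or EOF */
        /* ... process n bytes ... */
    }
}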


2000-10-27 00:50:52

by Alan

Subject: Re: kqueue microbenchmark results

> kqueue currently does this; a close() on an fd will remove any pending
> events from the queues that they are on which correspond to that fd.

This seems an odd thing to do. Surely what you need to do is to post a
'close completed' event to the queue. This also makes more sense when you
have a threaded app and another thread may well currently be in, say, a
read at the time it is closed.

2000-10-27 01:03:34

by Alfred Perlstein

Subject: Re: kqueue microbenchmark results

* Alan Cox <[email protected]> [001026 17:50] wrote:
> > kqueue currently does this; a close() on an fd will remove any pending
> > events from the queues that they are on which correspond to that fd.
>
> This seems an odd thing to do. Surely what you need to do is to post a
> 'close completed' event to the queue. This also makes more sense when you
> have a threaded app and another thread may well currently be in say a read
> at the time it is closed

Kqueue's flexibility could allow this to be implemented, all you
would need to do is make a new filter trigger. You might need
a _bit_ of hackery to make sure those aren't removed, or one
could just add the event after clearing all pending events.

Adding a filter to be informed when a specific fd is closed is
certainly an option, but it doesn't make very much sense because that
fd could then be reused quickly by something else...

but anyhow:

The point of this interface is to ask kqueue to report only on the
things you are interested in, not to generate superfluous events that
you wouldn't care about. You could add such a flag if Linux adopted
this interface, and I'm sure we'd be forced to adopt it, but if you
make kqueue generate info an application won't care about, I don't
think that would be taken back.

--
-Alfred Perlstein - [[email protected]|[email protected]]
"I have the heart of a child; I keep it in a jar on my desk."

2000-10-27 01:12:47

by Jonathan Lemon

Subject: Re: kqueue microbenchmark results

On Fri, Oct 27, 2000 at 01:50:40AM +0100, Alan Cox wrote:
> > kqueue currently does this; a close() on an fd will remove any pending
> > events from the queues that they are on which correspond to that fd.
>
> This seems an odd thing to do. Surely what you need to do is to post a
> 'close completed' event to the queue. This also makes more sense when you
> have a threaded app and another thread may well currently be in say a read
> at the time it is closed

Actually, it makes sense when you think about it. The `fd' is actually
a capability that the application uses to refer to the open file in the
kernel. If the app does a close() on the fd, it destroys this naming.

The application then has no capability left which refers to the formerly
open socket, and conversely, the kernel has no capability (name) to notify
the application of a close event. What can I say, "the fd formerly known
as X" is now gone? It would be incorrect to say that "fd X was closed",
since X no longer refers to anything, and the application may have reused
that fd for another file.

As for the multi-thread case, this would be a bug; if one thread closes
the descriptor, the other thread is going to get an EBADF when it goes
to perform the read.
--
Jonathan
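
[A tiny sketch, not from the original mail, of the naming point above:
the number released by close() is immediately available for reuse, so
there is nothing stable left for the kernel to name in an "fd X was
closed" event. The file names used are arbitrary.]

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    int a = open("/etc/hosts", O_RDONLY);

    close(a);                               /* the name 'a' now refers to nothing */

    /* open() returns the lowest free descriptor, so in this
       single-threaded program b gets the same number a had */
    int b = open("/etc/services", O_RDONLY);

    printf("old fd %d, new fd %d\n", a, b);
    return 0;
}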

2000-10-27 01:33:33

by Alan

Subject: Re: kqueue microbenchmark results

> the application of a close event. What can I say, "the fd formerly known
> as X" is now gone? It would be incorrect to say that "fd X was closed",
> since X no longer refers to anything, and the application may have reused
> that fd for another file.

Which is precisely why you need to know where in the chain of events this
happened. Otherwise if I see

'read on fd 5'
'read on fd 5'

How do I know which read is for which fd in the multithreaded case?

> As for the multi-thread case, this would be a bug; if one thread closes
> the descriptor, the other thread is going to get an EBADF when it goes
> to perform the read.

Another thread may already have reused the fd.

2000-10-27 01:47:14

by Alfred Perlstein

Subject: Re: kqueue microbenchmark results

* Alan Cox <[email protected]> [001026 18:33] wrote:
> > the application of a close event. What can I say, "the fd formerly known
> > as X" is now gone? It would be incorrect to say that "fd X was closed",
> > since X no longer refers to anything, and the application may have reused
> > that fd for another file.
>
> Which is precisely why you need to know where in the chain of events this
> happened. Otherwise if I see
>
> 'read on fd 5'
> 'read on fd 5'
>
> How do I know which read is for which fd in the multithreaded case

No, you don't; you don't see anything with the current code unless
fd 5 is still around. What you're presenting to Jonathan is an
application threading problem, not something that needs to be
resolved by the OS.

> > As for the multi-thread case, this would be a bug; if one thread closes
> > the descriptor, the other thread is going to get an EBADF when it goes
> > to perform the read.
>
> Another thread may already have reused the fd

This is another example of an application threading problem.

--
-Alfred Perlstein - [[email protected]|[email protected]]
"I have the heart of a child; I keep it in a jar on my desk."

2000-10-27 16:37:00

by Dan Kegel

Subject: Re: kqueue microbenchmark results

Alan Cox wrote:
> > > kqueue currently does this; a close() on an fd will remove any pending
> > > events from the queues that they are on which correspond to that fd.
> >
> > the application of a close event. What can I say, "the fd formerly known
> > as X" is now gone? It would be incorrect to say that "fd X was closed",
> > since X no longer refers to anything, and the application may have reused
> > that fd for another file.
>
> Which is precisely why you need to know where in the chain of events this
> happened. Otherwise if I see
>
> 'read on fd 5'
> 'read on fd 5'
>
> How do I know which read is for which fd in the multithreaded case

That can't happen, can it? Let's say the following happens:
close(5)
accept() = 5
call kevent() and rebind fd 5
The 'close(5)' would remove the old fd 5 events. Therefore,
any fd 5 events you see returned from kevent are for the new fd 5.

(I suspect it helps that kevent() is both the only way to
bind events and the only way to pick them up; makes it harder
for one thread to sneak a new fd into the event list without
the thread calling kevent() noticing.)

- Dan
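
[Spelled out with the actual calls, as a sketch rather than code from the
thread; 'kq' and the listening socket 'lsock' are assumed to exist
already. Because close() removed the old fd's pending events inside the
kernel, the EV_ADD below can only ever produce events for the new
connection.]

#include <sys/types.h>
#include <sys/event.h>
#include <sys/socket.h>
#include <unistd.h>

void
replace_connection(int kq, int lsock, int oldfd)
{
    struct kevent kev;
    int newfd;

    close(oldfd);                        /* pending events for oldfd vanish */
    newfd = accept(lsock, NULL, NULL);   /* may well return the same number */

    EV_SET(&kev, newfd, EVFILT_READ, EV_ADD, 0, 0, NULL);
    kevent(kq, &kev, 1, NULL, 0, NULL);  /* rebind; any event seen later for
                                            this number refers to the new fd */
}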

2000-10-27 16:42:58

by Alfred Perlstein

Subject: Re: kqueue microbenchmark results

* Dan Kegel <[email protected]> [001027 09:40] wrote:
> Alan Cox wrote:
> > > > kqueue currently does this; a close() on an fd will remove any pending
> > > > events from the queues that they are on which correspond to that fd.
> > >
> > > the application of a close event. What can I say, "the fd formerly known
> > > as X" is now gone? It would be incorrect to say that "fd X was closed",
> > > since X no longer refers to anything, and the application may have reused
> > > that fd for another file.
> >
> > Which is precisely why you need to know where in the chain of events this
> > happened. Otherwise if I see
> >
> > 'read on fd 5'
> > 'read on fd 5'
> >
> > How do I know which read is for which fd in the multithreaded case
>
> That can't happen, can it? Let's say the following happens:
> close(5)
> accept() = 5
> call kevent() and rebind fd 5
> The 'close(5)' would remove the old fd 5 events. Therefore,
> any fd 5 events you see returned from kevent are for the new fd 5.
>
> (I suspect it helps that kevent() is both the only way to
> bind events and the only way to pick them up; makes it harder
> for one thread to sneak a new fd into the event list without
> the thread calling kevent() noticing.)

Yes, that's how it does and should work. Noticing the close()
should be done via thread communication/IPC, not stuck into
kqueue.

--
-Alfred Perlstein - [[email protected]|[email protected]]
"I have the heart of a child; I keep it in a jar on my desk."

2000-10-27 23:52:26

by Terry Lambert

Subject: Re: kqueue microbenchmark results

> > Which is precisely why you need to know where in the chain of events this
> > happened. Otherwise if I see
> >
> > 'read on fd 5'
> > 'read on fd 5'
> >
> > How do I know which read is for which fd in the multithreaded case
>
> That can't happen, can it? Let's say the following happens:
> close(5)
> accept() = 5
> call kevent() and rebind fd 5
> The 'close(5)' would remove the old fd 5 events. Therefore,
> any fd 5 events you see returned from kevent are for the new fd 5.
>
> (I suspect it helps that kevent() is both the only way to
> bind events and the only way to pick them up; makes it harder
> for one thread to sneak a new fd into the event list without
> the thread calling kevent() noticing.)

Strictly speaking, it can happen in two cases:

1) single acceptor thread, multiple worker threads

2) multiple anonymous "work to do" threads

In both these cases, the incoming requests from a client are
given to any thread, rather than a particular thread.

In the first case, we can have (thread id : execution order : event):

1:1:open 5
2:2:read 5
3:4:read 5
2:3:close 5

If thread 2 processes the close event before thread 3 processes
the read event, then when thread 3 attempts processing, it will
fail.

Technically, this is a group ordering problem in the design of
the software, which should instead queue all events to a dispatch
thread, and the threads should use IPC to serialize processing of
serial events. This is similar to the problem with async mounted
FS recovery in the event of a crash: without ordering guarantees, you
can only get to a "good" state, not necessarily "the one correct
state".

In the second case, we can have:

1:2:read 5
2:1:open 5
3:4:read 5
2:3:close 5

This is just a non-degenerate form of the first case, where we
allow thread 1 and all other threads to be identical, and don't
serialize open state initialization.

The NetWare for UNIX system uses this model. The benefit is
that all user space threads can be identical. This means that
I can use either threads or processes, and it won't matter, so
my software can run on older systems that lack "perfect" thread
models, simply by using processes and putting client state into
shared memory.

In this case, there is no need for inter-thread synchronization;
instead, we must insist that events be dispatched sequentially,
and that the events be processed serially. This effectively
requires event processing completion notification from user
space to kernel space.

In NetWare for UNIX, this was accomplished using a streams MUX
which knew that the NetWare protocol was request-response. This
also permitted "busy" responses to be turned around in kernel
space, without incurring a kernel-to-user space scheduling
penalty. It also permitted "piggyback", where an ioctl to the
mux was used to respond, and combined sending a response with
the next read. This reduced protection domain crossing and the
context switch overhead by 50%. Finally, the MUX sent requests
to user space in LIFO order. This approach is called "hot engine
scheduling", in that the last reader in from user space is the
most likely to have its pages in core, so as not to need swapping
to handle the next request.

I was the architect of much of the process model discussed above; as
you can see, there are some significant performance wins to be
had by building the right interfaces, and putting the code on
the right side of the user/kernel boundary.

In any case, the answer is that you cannot assume that the only
correct way to solve a problem like event inversion is serialization
of events in user space (or kernel space). This is not strictly a
"threaded application implementation" issue, and it is not strictly
a kernel serialization of event delivery issue.

Another case, which NetWare did not handle, is that of rejected
authentication. Even if you went with the first model, and forced
your programmers to use expensive inter-thread synchronization, or
worse, bound each client to a single thread in the server, thus
rendering the system likely to have skewed thread load, getting
worse the longer the connection was up, you would still have the
problem of rejected authentication. A client might attempt to
send authentication followed by commands in the same packet series,
without waiting for an explicit ACK after each one (i.e. it might
attempt to implement a sliding window over a virtual circuit), and
the system on the other end might diligently queue the events,
only to have the authentication be rejected, but with packets
already queued to user space for processing, assuming serialization
in user space. You would then need a much more complex mechanism
to allow you to invalidate an already queued event to another
thread, which you don't know about in your thread, before you
release the interlock. Otherwise the client may get responses
without a valid authentication.

You need look no further than LDAPv3 for an example of a protocol
where this is possible (assuming X.509 certificate-based SASL
authentication, where authentication is not challenge-response,
but where it consists solely of presenting one's certificate).


When considering this API, you have to consider more than just
the programming models you think are "right"; you have to
consider all of the models that are possible.

All in all, this is an interesting discussion. 8-).


Terry Lambert
[email protected]
---
Any opinions in this posting are my own and not those of my present
or previous employers.

2000-10-28 00:19:35

by Dan Kegel

Subject: Re: kqueue microbenchmark results

Terry Lambert wrote:
>
> > > Which is precisely why you need to know where in the chain of events this
> > > happened. Otherwise if I see
> > > 'read on fd 5'
> > > 'read on fd 5'
> > > How do I know which read is for which fd in the multithreaded case
> >
> > That can't happen, can it? Let's say the following happens:
> > close(5)
> > accept() = 5
> > call kevent() and rebind fd 5
> > The 'close(5)' would remove the old fd 5 events. Therefore,
> > any fd 5 events you see returned from kevent are for the new fd 5.
>
> Strictly speaking, it can happen in two cases:
>
> 1) single acceptor thread, multiple worker threads
> 2) multiple anonymous "work to do" threads
>
> In both these cases, the incoming requests from a client are
> given to any thread, rather than a particular thread.
>
> In the first case, we can have (id:executer order:event):
>
> 1:1:open 5
> 2:2:read 5
> 3:4:read 5
> 2:3:close 5
>
> If thread 2 processes the close event before thread 3 processes
> the read event, then when thread 3 attempts procssing, it will
> fail.

You're not talking about kqueue() / kevent() here, are you?
With that interface, thread 2 would not see a close event;
instead, the other events for fd 5 would vanish from the queue.
If you were indeed talking about kqueue() / kevent(), please flesh
out the example a bit more, showing who calls kevent().

(A race that *can* happen is that fd 5 could be closed by another
thread after a 'read 5' event is pulled from the event queue and
before it is processed, but that could happen with any
readiness notification API at all.)

- Dan