LinuxLists.cc - [RFC] Heads up on a series of AIO patchsets

2006-12-27 15:34:45

Subject: [RFC] Heads up on a series of AIO patchsets

Here is a quick attempt to summarize where we are heading with a bunch of
AIO patches that I'll be posting over the next few days. Because a few of
these patches have been hanging around for a bit, and have gone through
bursts of iterations from time to time, falling dormant for other phases,
the intent of this note is to help pull things together into some coherent
picture for folks to comment on the patches and arrive at a decision of
some sort.

Native linux aio (i.e using libaio) is properly supported (in the sense of
being asynchronous) only for files opened with O_DIRECT, which actually
suffices for a major (and most visible) user of AIO, i.e. databases.

However, for other types of users, e.g. Samba and other applications which
use POSIX AIO, there have been several issues outstanding for a while:

(1) The filesystem AIO patchset, attempts to address one part of the problem
which is to make regular file IO, (without O_DIRECT) asynchronous (mainly
the case of reads of uncached or partially cached files, and O_SYNC writes).

(2) Most of these other applications need the ability to process both
network events (epoll) and disk file AIO in the same loop. With POSIX AIO
they could at least sort of do this using signals (yeah, and all associated
issues). The IO_CMD_EPOLL_WAIT patch (originally from Zach Brown with
modifications from Jeff Moyer and me) addresses this problem for native
linux aio in a simple manner. Tridge has written a test harness to
try out the Samba4 event library modifications to use this. Jeff Moyer
has a modified version of pipetest for comparison.

(3) For glibc POSIX AIO to switch to using native AIO (instead of simulation
with threads) kernel changes are needed to ensure aio sigevent notification
and efficient listio support. Sebestian Dugue's patches for aio sigevent
notifications has undergone several review iterations and seems to be
in good shape now. His patch for lio_listio is pending discussion
on whether to implement it as a separate syscall rather than an additional
iocb command. Bharata B Rao has posted a patch with the syscall variation
for review.

(4) If glibc POSIX AIO switches completely to using native AIO then it
would need basic AIO support for various file types - including sockets,
pipes etc. Since it no longer will be simulating asynchronous behaviour
with threads, it expects the underlying implementation to be asynchronous.
Which is still an issue with native linux AIO, but I now think the problem
to be tractable without a lot of additional work. While (1) helps the case
for regular files, (2) now provides us an alternative infrastructure to
simulate this in kernel using async epoll and O_NONBLOCK for all pollable
fds, i.e. sockets, pipes etc. This should be good enough for working
POSIX AIO.

(5) That leaves just one more todo - implementing aio_fsync() in kernel.

Please note that all of this work is not in conflict with kevent development.
In fact it is my hope that progress made in getting these pieces of the
puzzle in place would also help us along the long term goal of eventual
convergence.

Regards
Suparna

--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Lab, India

2006-12-27 16:25:41

by Christoph Hellwig

[permalink] [raw]

Subject: Re: [RFC] Heads up on a series of AIO patchsets

2006-12-27 16:58:06

by Ingo Molnar

[permalink] [raw]

Subject: Re: [RFC] Heads up on a series of AIO patchsets

* Christoph Hellwig <[email protected]> wrote:

> The real question here is which interface we want people to use for
> these "combined" applications. Evgeny is heavily pushing kevent for
> this while other seem to prefer integration epoll into the aio
> interface. (1)
>
> I must admit that kevent seems to be the cleaner way to support this,
> although I see some advantages for the aio variant. I do think
> however that we should not actively promote two differnt interfaces
> long term.

i see no fundamental disadvantage from doing both. That way the 'market'
of applications will vote. (we have 2 other fundamental types available
as well: sync IO and poll() based IO - so it's not like we have the
choice between 2 or 1 variant, we have the choice between 4 or 3
variants)

> (1) note that there is another problem with the current kevent
> interface, and that is that it duplicates the event infrastructure
> for it's underlying subsystems instead of reusing existing code
> (e.g. inotify, epoll, dio-aio). If we want kevent to be _the_
> unified event system for Linux we need people to help out with
> straightening out these even provides as Evgeny seems to be
> unwilling/unable to do the work himself and the duplication is
> simply not acceptable.

yeah. The internal machinery should be as unified as possible - but
different sets of APIs can be offered, to make it easy for people to
extend their existing apps in the most straightforward way.

(In fact i'd like to see all the 'poll table' code to be unified into
this as well, if possible - it does not really "poll" anything, it's an
event infrastructure as well, used via the naive select() and poll()
syscalls. We should fix that naming mistake.)

Ingo

2006-12-27 17:22:04

by Ingo Molnar

[permalink] [raw]

Subject: Re: [RFC] Heads up on a series of AIO patchsets

* Ingo Molnar <[email protected]> wrote:

> > unified event system for Linux we need people to help out with
> > straightening out these even provides as Evgeny seems to be
> > unwilling/unable to do the work himself and the duplication is
> > simply not acceptable.
>
> yeah. The internal machinery should be as unified as possible - but
> different sets of APIs can be offered, to make it easy for people to
> extend their existing apps in the most straightforward way.

just to expand on this: i dont think this should be an impediment to the
POSIX AIO patches. We should get some movement into this and should give
the capability to glibc and applications. Kernel-internal unification is
something we are pretty good at doing after the fact. (and if any of the
APIs dies or gets very uncommon we know in which direction to unify)

Ingo

2006-12-28 08:18:48

Subject: [RFC] Heads up on a series of AIO patchsets

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: [FSAIO][PATCH 1/6] Add a wait queue parameter to the wait_bit action routine

Subject: [FSAIO][PATCH 2/8] Rename __lock_page to lock_page_slow

Subject: [FSAIO][PATCH 3/8] Routines to initialize and test a wait bit key

Subject: [FSAIO][PATCH 4/8] Add a default io wait bit field in task struct

Subject: [FSAIO][PATCH 5/8] Enable wait bit based filtered wakeups to work for AIO

Subject: [FSAIO][PATCH 6/8] Enable asynchronous wait page and lock page

Subject: [FSAIO][PATCH 7/8] Filesystem AIO read

Subject: [FSAIO][PATCH 8/8] AIO O_SYNC filesystem write

Subject: Re: [FSAIO][PATCH 1/6] Add a wait queue parameter to the wait_bit action routine

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: Re: [FSAIO][PATCH 6/8] Enable asynchronous wait page and lock page

Subject: Re: [FSAIO][PATCH 7/8] Filesystem AIO read

Subject: Re: [FSAIO][PATCH 7/8] Filesystem AIO read

Subject: Re: [FSAIO][PATCH 6/8] Enable asynchronous wait page and lock page

Subject: Re: [FSAIO][PATCH 7/8] Filesystem AIO read

Subject: Re: [FSAIO][PATCH 7/8] Filesystem AIO read

Subject: Re: [FSAIO][PATCH 7/8] Filesystem AIO read

Subject: Re: [FSAIO][PATCH 3/8] Routines to initialize and test a wait bit key

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [FSAIO][PATCH 6/8] Enable asynchronous wait page and lock page

Subject: Re: [FSAIO][PATCH 7/8] Filesystem AIO read

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: [PATCHSET 2][PATCH 1/1] Combining epoll and disk file AIO

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [FSAIO][PATCH 6/8] Enable asynchronous wait page and lock page

Subject: [PATCHSET 3][PATCH 0/5][AIO] - AIO completion signal notification v4

Subject: [PATCHSET 3][PATCH 1/5][AIO] - Rework compat_sys_io_submit

Attachments:

Subject: [PATCHSET 3][PATCH 2/5][AIO] - fix aio.h includes

Attachments:

Subject: [PATCHSET 3][PATCH 3/5][AIO] - Make good_sigevent non-static

Attachments:

Subject: [PATCHSET 3][PATCH 4/5][AIO] - AIO completion signal notification

Attachments:

Subject: [PATCHSET 3][PATCH 5/5][AIO] - Add listio support

Attachments:

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: Re: [RFC] Heads up on a series of AIO patchsets

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: [PATCHSET 4][PATCH 1/1] AIO fallback for pipes, sockets and pollable fds

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Subject: Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write