LinuxLists.cc - Re: [RFC,PATCH] use rcu for fasync

2004-01-02 21:31:46

Subject: Re: [RFC,PATCH] use rcu for fasync_lock

Jamie Lokier wrote:

> We have found the performance impact of the extra ->poll calls
> negligible with epoll. They're simply not slow calls. It's
> only when you're doing select() or poll() of many descriptors
> repeatedly that you notice, and that's already poor usage in other
> ways.

I do agree with you, but there is a lot of old software, and software
written on/for BSD, which does do this. I'm not prepared to say that BSD
does it better, but it's easier to fix in one place, the kernel, than
many other places.

Your point about the complexity is also correct, but perhaps someone
will offer a better solution to speeding up select(). I think anything
as major as this might be better off in a development series, and that's
a clear prod for someone to find a simpler way to do it ;-)

Old programs grow. INN uses select() and worked fine with 10-20 peers,
but with 200 peers sharing 2m articles and 1 TB of data it seems to work
less well on Linux than on BSD or Solaris. I'd love to see select() get
faster; there are lots of other servers out there as well.

--
bill davidsen
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

2004-01-02 22:42:13

by Jamie Lokier

Subject: Re: [RFC,PATCH] use rcu for fasync_lock

Bill Davidsen wrote:
> Jamie Lokier wrote:
> >We have found the performance impact of the extra ->poll calls
> >negligible with epoll. They're simply not slow calls. It's
> >only when you're doing select() or poll() of many descriptors
> >repeatedly that you notice, and that's already poor usage in other
> >ways.
>
> I do agree with you, but there is a lot of old software, and software
> written on/for BSD, which does do this. I'm not prepared to say that BSD
> does it better, but it's easier to fix in one place, the kernel, than
> many other places.
>
> Your point about the complexity is also correct, but perhaps someone
> will offer a better solution to speeding up select(). I think anything
> as major as this might be better off in a development series, and that's
> a clear prod for someone to find a simpler way to do it ;-)

Eliminating up to half of the ->poll calls using wake_up_info() and
reducing the number of wakeups using an event mask argument to ->poll
are not the best ways to speed up select() or poll() for large numbers
of descriptors.

The best way is to maintain poll state in each "struct file". The
order of complexity for the bitmap scan is still significant, but
->poll calls are limited to the number of transitions which actually
happen.

I think somebody, maybe Richard Gooch, has a patch to do this that's
several years old by now.

-- Jamie
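
The idea of keeping poll state per file can be modelled in user space. What
follows is a minimal sketch under stated assumptions, not kernel code and not
the patch mentioned above: struct model_file, model_poll(), model_transition()
and model_scan() are all hypothetical names. The scan still visits every
entry, but the expensive ->poll stand-in runs only for entries whose state
changed since the last scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_file {
    unsigned int  actual;     /* what a real ->poll call would report */
    unsigned int  cached;     /* readiness mask remembered between scans */
    bool          stale;      /* set when a wakeup signals a transition */
    unsigned long poll_calls; /* how often the ->poll stand-in ran */
};

/* A new entry has never been polled, so its cache starts out stale. */
static void model_init(struct model_file *f)
{
    f->actual = 0;
    f->cached = 0;
    f->stale = true;
    f->poll_calls = 0;
}

/* Stand-in for the driver's ->poll method. */
static unsigned int model_poll(struct model_file *f)
{
    f->poll_calls++;
    return f->actual;
}

/* Stand-in for a wakeup: readiness changed, so the cache is invalid. */
static void model_transition(struct model_file *f, unsigned int new_mask)
{
    f->actual = new_mask;
    f->stale = true;
}

/* Scan all entries: still O(n), but only stale entries pay for ->poll. */
static size_t model_scan(struct model_file *files, size_t n, unsigned int want)
{
    size_t ready = 0;

    assert(files != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (files[i].stale) {
            files[i].cached = model_poll(&files[i]);
            files[i].stale = false;
        }
        if (files[i].cached & want)
            ready++;
    }
    return ready;
}
```

Calling model_scan() repeatedly without an intervening model_transition()
leaves every poll_calls counter unchanged, which is the saving being
described. The hard part in a real kernel, keeping the stale flag correct for
every driver, is exactly what this model leaves out.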

2004-01-03 01:09:35

by Mike Fedyk

Subject: Re: [RFC,PATCH] use rcu for fasync_lock

On Fri, Jan 02, 2004 at 10:41:50PM +0000, Jamie Lokier wrote:
> The best way is to maintain poll state in each "struct file". The
> order of complexity for the bitmap scan is still significant, but
> ->poll calls are limited to the number of transitions which actually
> happen.

What's the drawback to this approach?

Where is the poll state kept now?

> I think somebody, maybe Richard Gooch, has a patch to do this that's
> several years old by now.

Why wasn't it merged?

Implementation issues?

2004-01-03 21:29:00

by Jamie Lokier

Subject: Re: [RFC,PATCH] use rcu for fasync_lock

Mike Fedyk wrote:
> On Fri, Jan 02, 2004 at 10:41:50PM +0000, Jamie Lokier wrote:
> > The best way is to maintain poll state in each "struct file". The
> > order of complexity for the bitmap scan is still significant, but
> > ->poll calls are limited to the number of transitions which actually
> > happen.
>
> What's the drawback to this approach?
>
> Where is the poll state kept now?

The poll state is not maintained at all _between_ calls to poll/select
at the moment, so at least one fresh call to ->poll is required per
file descriptor. That is something that can be changed.

> > I think somebody, maybe Richard Gooch, has a patch to do this that's
> > several years old by now.
>
> Why wasn't it merged?
> Implementation issues?

The impression I had was that the code is quite complicated and
invasive, and select/poll aren't considered worth optimising because
epoll is an overall better solution (which is true; optimising
select/poll would change the complexity of the slow part but not
reduce the complexity of the API part, while epoll does both).

See ftp://ftp.atnf.csiro.au/pub/people/rgooch/linux/kernel-patches/v2.1/fastpoll-readme

-- Jamie

2004-01-04 19:03:57

by Ingo Oeser

Subject: Re: [RFC,PATCH] use rcu for fasync_lock

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Saturday 03 January 2004 22:28, Jamie Lokier wrote:
> Mike Fedyk wrote:
> > On Fri, Jan 02, 2004 at 10:41:50PM +0000, Jamie Lokier wrote:
> > > The best way is to maintain poll state in each "struct file". The
> > > order of complexity for the bitmap scan is still significant, but
> > > ->poll calls are limited to the number of transitions which actually
> > > happen.
> >
> > What's the drawback to this approach?
> >
> > Where is the poll state kept now?
>
> The poll state is not maintained at all _between_ calls to poll/select
> at the moment, so at least one fresh call to ->poll is required per
> file descriptor. That is something that can be changed.

Yes, file->f_mode can be hijacked for this. Only 2 bits of it are used at the
moment. The bigger headache is clearing this state again, but this might not
be necessary, since we can always return EAGAIN if the cache is stale,
right?

> The impression I had was that the code is quite complicated and
> invasive, and select/poll aren't considered worth optimising because
> epoll is an overall better solution (which is true; optimising
> select/poll would change the complexity of the slow part but not
> reduce the complexity of the API part, while epoll does both).

This is true. But old software continues to exist and for INN there is
pretty much nothing else in this category available, I've been told by
several admins. Nobody really likes it, but it is used and improved
where necessary (epoll might be on the list already).

Regards

Ingo Oeser

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/+GMqU56oYWuOrkARAlC5AJ4sX3OvARw0lE7n35tvr0NfeUkJGgCgmUt6
PuPC9O9DMZt+bCNIiUa/viU=
=d23f
-----END PGP SIGNATURE-----

2004-01-04 19:20:34

by Davide Libenzi

Subject: Re: [RFC,PATCH] use rcu for fasync_lock

On Sun, 4 Jan 2004, Ingo Oeser wrote:

> > The impression I had was that the code is quite complicated and
> > invasive, and select/poll aren't considered worth optimising because
> > epoll is an overall better solution (which is true; optimising
> > select/poll would change the complexity of the slow part but not
> > reduce the complexity of the API part, while epoll does both).
>
> This is true. But old software continues to exist and for INN there is
> pretty much nothing else in this category available, I've been told by
> several admins. Nobody really likes it, but it is used and improved
> where necessary (epoll might be on the list already).

The problem with poll/select is not the Linux implementation. It is the
API that is flawed when applied to large fd sets. Every call passes the
whole fd set to the system, and this makes the API O(N) by definition. While
poll/select are perfectly ok for small fd sets, epoll LT might enable the
application to migrate from poll/select to epoll pretty quickly (if the
application architecture is fairly sane). For example, it took me about 15
minutes to make an epoll'd thttpd.

- Davide
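
The API difference described above can be shown side by side. This is a
minimal sketch, not the thttpd change itself: the wrapper names
wait_with_poll(), watch_with_epoll() and wait_with_epoll() are hypothetical,
while poll(), epoll_ctl() and epoll_wait() are the real Linux calls. With
poll() the whole descriptor array is handed to the kernel on every call; with
level-triggered epoll (no EPOLLET flag) each descriptor is registered once
and every wait returns only the ready ones.

```c
#include <assert.h>
#include <poll.h>
#include <sys/epoll.h>
#include <unistd.h>

/* poll(): the full descriptor array crosses into the kernel on every call. */
static int wait_with_poll(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    return poll(fds, nfds, timeout_ms);
}

/* epoll: register the descriptor once. Leaving out EPOLLET keeps it
 * level-triggered, so readiness is reported for as long as it holds,
 * which matches poll() semantics. */
static int watch_with_epoll(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN };

    assert(fd >= 0);
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Each wait returns only the descriptors that are ready right now. */
static int wait_with_epoll(int epfd, struct epoll_event *out, int max,
                           int timeout_ms)
{
    return epoll_wait(epfd, out, max, timeout_ms);
}
```

Because the level-triggered behaviour matches what a poll() loop already
expects, a port mostly consists of moving the registration out of the loop
and iterating over the returned events instead of the full array.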

2004-01-05 21:19:13

by Ingo Oeser

Subject: Re: [RFC,PATCH] use rcu for fasync_lock

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Davide,
hi lkml,

On Sunday 04 January 2004 20:20, you wrote:
> The problem with poll/select is not the Linux implementation. It is the
> API that is flawed when applied to large fd sets. Every call passes the
> whole fd set to the system, and this makes the API O(N) by definition. While
> poll/select are perfectly ok for small fd sets, epoll LT might enable the
> application to migrate from poll/select to epoll pretty quickly (if the
> application architecture is fairly sane). For example, it took me about 15
> minutes to make an epoll'd thttpd.

Yes, I read your analysis several years ago and I'm the first
one lobbying for epoll, but look at the posting stating that INN sucks
under Linux currently, but doesn't suck that hard under FreeBSD and
Solaris.

There are already enough things you cannot do properly under Linux
(which are mostly not Linux' fault, but still), so I don't want to add
another one. Especially in the server market, where the M$ lobbyists are
growing their market share.

But if there is some minimal funding available (50 EUR?), I would do it
myself and push the patches upstream ;-)

Regards

Ingo Oeser

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/+dRTU56oYWuOrkARAo60AJ9kYn39UEOvf/XR/Jx6aR4yIZWIYwCggPiT
zB84XIY75b3Z05KXS7qewbw=
=+wI/
-----END PGP SIGNATURE-----

2004-01-05 22:28:46

by Davide Libenzi

Subject: Re: [RFC,PATCH] use rcu for fasync_lock

On Mon, 5 Jan 2004, Ingo Oeser wrote:

> On Sunday 04 January 2004 20:20, you wrote:
> > The problem with poll/select is not the Linux implementation. It is the
> > API that is flawed when applied to large fd sets. Every call passes the
> > whole fd set to the system, and this makes the API O(N) by definition. While
> > poll/select are perfectly ok for small fd sets, epoll LT might enable the
> > application to migrate from poll/select to epoll pretty quickly (if the
> > application architecture is fairly sane). For example, it took me about 15
> > minutes to make an epoll'd thttpd.
>
> Yes, I read your analysis several years ago and I'm the first
> one lobbying for epoll, but look at the posting stating that INN sucks
> under Linux currently, but doesn't suck that hard under FreeBSD and
> Solaris.
>
> There are already enough things you cannot do properly under Linux
> (which are mostly not Linux' fault, but still), so I don't want to add
> another one. Especially in the server market, where the M$ lobbyists are
> growing their market share.
>
>
> But if there is some minimal funding available (50 EUR?), I would do it
> myself and push the patches upstream ;-)

IIRC INN was not multiplexing multiple clients within a single task.
Wasn't it a fork-and-handle kind of server?

- Davide