2003-01-23 23:10:35

by Lee Chin

Subject: debate on 700 threads vs asynchronous code

Hi
I am discussing with a few people on different approaches to solving a scale problem I am having, and have gotten vastly different views

In a nutshell, as far as this debate is concerned, I can say I am writing a web server.

Now, to cater to 700 clients, I can
a) launch 700 threads that each block on I/O to disk and to the client (in reading and writing on the socket)

OR

b) Write an asynchronous system with only two or three threads where I manage the connections and stack (via setcontext, swapcontext, etc.), which is programmatically a little harder

Which way will yield me better performance, considering both approaches are implemented optimally?

Thanks
Lee


2003-01-23 23:19:25

by Larry McVoy

Subject: Re: debate on 700 threads vs asynchronous code

> b) Write an asynchronous system with only two or three threads where I manage the connections and stack (via setcontext, swapcontext, etc.), which is programmatically a little harder
>
> Which way will yield me better performance, considering both approaches are implemented optimally?

If this is a serious question, an async system will by definition do better.
You have either 700 stacks screwing up the data cache or 2-3 stacks nicely
fitting in the data cache. Ditto for instruction cache, etc.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-01-23 23:22:25

by Ben Greear

Subject: Re: debate on 700 threads vs asynchronous code

Lee Chin wrote:
> Hi
> I am discussing with a few people on different approaches to solving a scale problem I am having, and have gotten vastly different views
>
> In a nutshell, as far as this debate is concerned, I can say I am writing a web server.
>
> Now, to cater to 700 clients, I can
> a) launch 700 threads that each block on I/O to disk and to the client (in reading and writing on the socket)
>
> OR
>
> b) Write an asynchronous system with only two or three threads where I manage the connections and stack (via setcontext, swapcontext, etc.), which is programmatically a little harder

You could also write something with async non-blocking IO and use NO threads
(i.e., just a single process), which may greatly simplify the debugging of
your program (unless the developers on your project are already very good at
threaded programming).

I suspect the async IO will perform better as well, but that is just an
unfounded opinion based on not wanting to think about scheduling 700 processes
that want to do IO :)
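
A minimal sketch of the single-process, non-blocking style suggested above, assuming a POSIX system. The helper names set_nonblocking() and try_read(), and the -2 "not ready" return value, are illustrative and not from the thread. A descriptor is switched to O_NONBLOCK, and a read that would otherwise sleep returns immediately so the one process can move on to another client.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Try to read without ever sleeping in the kernel.
 * Returns bytes read (> 0), 0 on end of file, -1 on a real error,
 * or -2 when no data is ready yet (a convention made up for this sketch).
 */
long try_read(int fd, char *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);   /* retry if a signal interrupted us */

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;                         /* nothing to read yet, try later */
    return (long)n;
}
```

A server built this way would call try_read() only on descriptors that select(), poll() or epoll reported as readable, and treat -2 as "come back later" rather than as an error.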

>
> Which way will yield me better performance, considering both approaches are implemented optimally?
>
> Thanks
> Lee


--
Ben Greear <[email protected]> <Ben_Greear AT excite.com>
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear


2003-01-23 23:58:10

by Lee Chin

Subject: Re: debate on 700 threads vs asynchronous code

Hi,
Thanks for the reply... my question was more: with setcontext and swapcontext, I will still be messing with the data cache, right?

In other words, as long as I have an async system without setcontext, I know I am good... but with it, haven't I degraded to a threaded environment?
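
To make the question concrete, here is a rough sketch of what a setcontext/swapcontext scheme involves, assuming glibc's <ucontext.h>. The names task_init() and task_resume() and the 64 KB stack size are made up for illustration. Each task still owns a private stack that is touched on every switch, which is the cache concern raised above; what goes away is the kernel scheduler's involvement in choosing who runs next.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define TASK_STACK_SIZE (64 * 1024)   /* assumed size; every task needs its own */

static ucontext_t main_ctx, task_ctx;
static int steps_run;                 /* progress made by the task so far */

/* Task body: does a little work, then yields back to the caller. */
static void task_body(void)
{
    for (;;) {
        steps_run++;
        swapcontext(&task_ctx, &main_ctx);   /* yield; state stays on this stack */
    }
}

/* Create the task on a private heap-allocated stack. Returns 0 on success. */
int task_init(void)
{
    void *stack = malloc(TASK_STACK_SIZE);

    if (stack == NULL)
        return -1;
    if (getcontext(&task_ctx) == -1) {
        free(stack);
        return -1;
    }
    task_ctx.uc_stack.ss_sp = stack;
    task_ctx.uc_stack.ss_size = TASK_STACK_SIZE;
    task_ctx.uc_link = &main_ctx;            /* where to go if task_body returns */
    makecontext(&task_ctx, task_body, 0);
    return 0;
}

/* Resume the task until its next yield; returns the total steps it has run. */
int task_resume(void)
{
    assert(task_ctx.uc_stack.ss_sp != NULL); /* task_init() must have succeeded */
    swapcontext(&main_ctx, &task_ctx);
    return steps_run;
}
```

With 700 connections this design would need 700 such stacks, so the memory picture resembles the threaded one even though only two or three kernel threads exist.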

Thanks
Lee

2003-01-24 01:14:39

by Dan Kegel

Subject: Re: debate on 700 threads vs asynchronous code

Lee Chin <[email protected]> wrote:
> Larry McVoy wrote:
>> > Now, to cater to 700 clients, I can
>> > a) launch 700 threads that each block on I/O to disk and to the client (in
>> > reading and writing on the socket)
>> > OR
>> > b) Write an asynchronous system with only two or three threads where I manage the
>> > connections and stack (via setcontext, swapcontext, etc.), which is
>> > programmatically a little harder
>> > Which way will yield me better performance, considering both approaches are
>> > implemented optimally?
>>
>> If this is a serious question, an async system will by definition do better.
>> You have either 700 stacks screwing up the data cache or 2-3 stacks nicely
>> fitting in the data cache. Ditto for instruction cache, etc.
>
> Thanks for the reply... my question was more: with setcontext and swapcontext, I
> will still be messing with the data cache, right?
>
> In other words, as long as I have an async system without setcontext, I know I am
> good... but with it, haven't I degraded to a threaded environment?

I suspect Linux's implementation of asynch I/O isn't able to handle sockets yet.
Thus the choice is between nonblocking I/O and blocking I/O.

Nonblocking I/O is totally the way to go if you have full control over your
source code and want the maximal performance in userspace. The best way
to get good performance with nonblocking I/O in Linux is to use the sys_epoll
system call; it's part of the 2.5 kernel, but a backport to 2.4 is available.
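
A minimal sketch of readiness notification with epoll, written against the epoll_ctl()/epoll_wait() interface found in current glibc (which may differ in detail from the 2.5-era sys_epoll patch discussed here). watch_readable(), collect_ready() and MAX_EVENTS are hypothetical names; the caller is assumed to have created epfd with epoll_create1(0).

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>     /* read()/close() for callers of these helpers */

#define MAX_EVENTS 64   /* assumed batch size for this sketch */

/* Register fd for read-readiness on epfd; returns 0 on success, -1 on error. */
int watch_readable(int epfd, int fd)
{
    struct epoll_event ev = { 0 };

    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/*
 * Collect the descriptors that are readable right now.
 * timeout_ms follows epoll_wait(): 0 polls, -1 blocks.
 * Returns how many were stored in ready[], or -1 on error.
 */
int collect_ready(int epfd, int *ready, int max, int timeout_ms)
{
    struct epoll_event events[MAX_EVENTS];
    int n;

    assert(ready != NULL && max > 0);
    if (max > MAX_EVENTS)
        max = MAX_EVENTS;

    /* One system call returns a whole batch of ready descriptors. */
    n = epoll_wait(epfd, events, max, timeout_ms);
    for (int i = 0; i < n; i++)
        ready[i] = events[i].data.fd;
    return n;
}
```

Unlike select() or poll(), the interest set is registered once and kept in the kernel, so the per-call cost depends on how many descriptors are ready rather than how many are watched.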

See http://www.kegel.com/c10k.html for an overview of the issue and some links.
- Dan

--
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045

2003-01-24 01:32:15

by Dan Kegel

Subject: Re: debate on 700 threads vs asynchronous code

Mark Hahn wrote:
>>Nonblocking I/O is totally the way to go if you have full control over your
>>source code and want the maximal performance in userspace. The best way
>
> why do you think it's better for user-space? I was trying to explain
> it to someone this afternoon, and we couldn't find any reason for
> threads/blocking to be slow. IO-completion wakes up the thread, which
> goes through the scheduler right back to the user's stack-frame,
> even providing the io-completion status. no large cache footprint
> anywhere (at least with a lightweight thread library), no multiplexing
> like for select/poll, etc.

I suspect the thread *does* have a larger cache footprint,
since in nonblocking I/O, session state is stored more compactly.
Also, the threaded approach involves lots more context switches.

> does epoll provide a thunk (callback and state variable) as well as the
> IO completion status?

No. It provides an event record containing a user-defined state pointer
plus the IO readiness status change (different from IO completion status).
But that's what you need; you can do the call yourself.
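
The "user-defined state pointer, then do the call yourself" idea might look like the following sketch. struct session, its on_ready callback and drain_input() are assumptions for illustration; only epoll_ctl(), epoll_wait() and the data.ptr field come from the real API.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-connection record; epoll only stores a pointer to it. */
struct session {
    int fd;
    void (*on_ready)(struct session *s, uint32_t events);
    long bytes_seen;          /* example state updated by the handler */
};

/* Example handler: consume whatever is readable and count it. */
void drain_input(struct session *s, uint32_t events)
{
    char buf[64];

    if (events & EPOLLIN) {
        ssize_t n = read(s->fd, buf, sizeof buf);
        if (n > 0)
            s->bytes_seen += n;
    }
}

/* Attach the session itself as the user-defined pointer. */
int session_watch(int epfd, struct session *s)
{
    struct epoll_event ev = { 0 };

    assert(s != NULL && s->on_ready != NULL);
    ev.events = EPOLLIN;
    ev.data.ptr = s;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, s->fd, &ev);
}

/* One dispatch pass: epoll reports readiness, we make the call ourselves. */
int dispatch_once(int epfd, int timeout_ms)
{
    struct epoll_event events[32];
    int n = epoll_wait(epfd, events, 32, timeout_ms);

    for (int i = 0; i < n; i++) {
        struct session *s = events[i].data.ptr;
        s->on_ready(s, events[i].events);   /* the "thunk" lives in user space */
    }
    return n;
}
```

The callback and its state travel together in the session record, so the kernel never needs to know about either.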

>>See http://www.kegel.com/c10k.html for an overview of the issue and some links.
>
>
> it's a great resource, except that for 700 clients, the difference
> between select, poll, epoll, aio are pretty moot. no?

Depends on how close to maximal performance you need, and whether
you might later need to scale to more clients.

The average server is so lightly loaded, it really doesn't matter which approach you use.
- Dan


--
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045

2003-01-24 07:49:39

by Dan Kegel

Subject: Re: debate on 700 threads vs asynchronous code

Mark Hahn wrote:
> in principle, why should the footprint be large? it's a register set
> and at most a couple cachelines of stack frame.

... but all the threads' cachelines will collide, whereas if
you're using nonblocking I/O, session state might be staggered better.
This is just a guess; I haven't measured it.

>>>does epoll provide a thunk (callback and state variable) as well as the
>>>IO completion status?
>>
>>No. It provides an event record containing a user-defined state pointer
>>plus the IO readiness status change (different from IO completion status).
>>But that's what you need; you can do the call yourself.
>
> well, that means another syscall, which makes a footprint claim kind of moot,
> no?

What syscall? You call sys_epoll once for every thousand events or so,
then you call your handler, which does a write or whatever. No
extra syscall.

>>>>See http://www.kegel.com/c10k.html for an overview of the issue and some links.
>>>
>>>
>>>it's a great resource, except that for 700 clients, the difference
>>>between select, poll, epoll, aio are pretty moot. no?
>>
>>Depends on how close to maximal performance you need, and whether
>>you might later need to scale to more clients.
>
>
> no, I'm suggesting the choice is nonlinear: that for moderately large loads,
> like 700 clients, there is no advantage to traditional approaches.

I agree that for 700 clients the answer may be different than for 2000 clients.
However, if you have to handle 700 clients, how do you know you won't
have to handle 2000 later?

In any case, benchmarking's the only way to go. No amount of talk will
substitute for a good real-life measurement. That's what convinced
me that epoll was faster than sigio, and that sigio was
sometimes slower than select() !

And, for what it's worth, programmer productivity is sometimes
more important than all the above. I happen to work
at a place where performance is worth a lot of extra effort,
but other shops prefer to throw hardware at the problem and
not worry about that last 10%.

- Dan

--
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045

2003-01-24 08:08:40

by Mark Mielke

Subject: Re: debate on 700 threads vs asynchronous code

On Fri, Jan 24, 2003 at 12:21:49AM -0800, Dan Kegel wrote:
> In any case, benchmarking's the only way to go. No amount of talk will
> substitute for a good real-life measurement. That's what convinced
> me that epoll was faster than sigio, and that sigio was
> sometimes slower than select() !

Also, anybody can write a poor implementation of each, so even
benchmarks are suspect...

My personal favourite model currently is switched I/O, but prioritized
threads per expected event frequency or event priority. For example,
events that won't likely occur for some time, or have a low priority,
can all be pushed to a low priority thread. Not only does this keep
the operating system free to give the CPUs to higher priority
threads, but the higher priority threads have fewer resources to
manage, leading to more efficient operation. Also, event handling code
that may take some time to complete should be moved to its own thread
in a thread pool, allowing the dispatching to fully complete without
needing to actually execute all of the (expensive) handlers.

> And, for what it's worth, programmer productivity is sometimes
> more important than all the above. I happen to work
> at a place where performance is worth a lot of extra effort,
> but other shops prefer to throw hardware at the problem and
> not worry about that last 10%.

Definitely an argument for the one thread per connection model. :-)

mark

--
[email protected]/[email protected]/[email protected] __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

2003-01-24 22:43:47

by Corey Minyard

Subject: Re: debate on 700 threads vs asynchronous code

Mark Mielke wrote:

>>And, for what it's worth, programmer productivity is sometimes
>>more important than all the above. I happen to work
>>at a place where performance is worth a lot of extra effort,
>>but other shops prefer to throw hardware at the problem and
>>not worry about that last 10%.
>>
>>
>
>Definitely an argument for the one thread per connection model. :-)
>
I would disagree. One thread per connection is easier to conceptually
understand. In my experience, an event-driven model (which is what you
end up with if you use one or a few threads) is actually easier to
correctly implement and it tends to make your code more modular and
portable.

-Corey

2003-01-24 23:12:01

by Matti Aarnio

Subject: Re: debate on 700 threads vs asynchronous code

On Fri, Jan 24, 2003 at 04:53:46PM -0600, Corey Minyard wrote:
...
> I would disagree. One thread per connection is easier to conceptually
> understand. In my experience, an event-driven model (which is what you
> end up with if you use one or a few threads) is actually easier to
> correctly implement and it tends to make your code more modular and
> portable.

An old thing from early annals of computer science (I browsed Knuth's
"The Art" again..) is called Coroutine.

Gives you "one thread per connection" programming model, but without
actual multiple scheduling threads in the kernel side.

Simplest coroutine implementations are truly simple.. A pageful of C.
Knuth shows it with very few MIX (assembly) instructions.

Throwing in non-blocking socket/filedescriptor access, and in event
of "EAGAIN", coroutine-yielding to some other coroutine, does complicate
things, naturally.

Good coder finds balance in between various methods, possibly uses
both coroutine "userspace threads", and actual kernel threads.

Doing coroutine library all in portable C (by means of setjmp()/longjmp())
is possible, but not very efficient. A bit of assembly helps a lot.

> -Corey

/Matti Aarnio

2003-01-24 23:30:03

by Randy.Dunlap

Subject: Re: debate on 700 threads vs asynchronous code

On Sat, 25 Jan 2003, Matti Aarnio wrote:

| On Fri, Jan 24, 2003 at 04:53:46PM -0600, Corey Minyard wrote:
| ...
| > I would disagree. One thread per connection is easier to conceptually
| > understand. In my experience, an event-driven model (which is what you
| > end up with if you use one or a few threads) is actually easier to
| > correctly implement and it tends to make your code more modular and
| > portable.
|
| An old thing from early annals of computer science (I browsed Knuth's
| "The Art" again..) is called Coroutine.
|
| Gives you "one thread per connection" programming model, but without
| actual multiple scheduling threads in the kernel side.
|
| Simplest coroutine implementations are truly simple.. A pageful of C.
| Knuth shows it with very few MIX (assembly) instructions.
|
| Throwing in non-blocking socket/filedescriptor access, and in event
| of "EAGAIN", coroutine-yielding to some other coroutine, does complicate
| things, naturally.
|
| Good coder finds balance in between various methods, possibly uses
| both coroutine "userspace threads", and actual kernel threads.
|
| Doing coroutine library all in portable C (by means of setjmp()/longjmp())
| is possible, but not very efficient. A bit of assembly helps a lot.
|
| > -Corey
|
| /Matti Aarnio
| -

Davide Libenzi (epoll) likes and discusses coroutines on one of his
web pages: http://www.xmailserver.org/linux-patches/nio-improve.html
(search for /coroutine/)

--
~Randy

2003-01-24 23:39:07

by Dan Kegel

Subject: Re: debate on 700 threads vs asynchronous code

Randy.Dunlap wrote:
> On Sat, 25 Jan 2003, Matti Aarnio wrote:
>
> | On Fri, Jan 24, 2003 at 04:53:46PM -0600, Corey Minyard wrote:
> | ...
> | > I would disagree. One thread per connection is easier to conceptually
> | > understand. In my experience, an event-driven model (which is what you
> | > end up with if you use one or a few threads) is actually easier to
> | > correctly implement and it tends to make your code more modular and
> | > portable.
> |
> | An old thing from early annals of computer science (I browsed Knuth's
> | "The Art" again..) is called Coroutine.
> |
> | Gives you "one thread per connection" programming model, but without
> | actual multiple scheduling threads in the kernel side. ...
> | Doing coroutine library all in portable C (by means of setjmp()/longjmp())
> | is possible, but not very efficient. A bit of assembly helps a lot.

There's also an elegant implementation that uses switch statements
or computed gotos; see http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
I'm using it. It's a bit limited, but hey, it works for me.

> Davide Libenzi (epoll) likes and discusses coroutines on one of his
> web pages: http://www.xmailserver.org/linux-patches/nio-improve.html
> (search for /coroutine/)

IMHO coroutines are harder to use than either threads or nonblocking I/O.
Then again, I don't like Scheme; many things in this world are a matter of taste.
- Dan

--
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045

2003-01-24 23:52:18

by Dan Kegel

Subject: Re: debate on 700 threads vs asynchronous code

Mark Hahn wrote:
>>>>>does epoll provide a thunk (callback and state variable) as well as the
>>>>>IO completion status?
>>>>
>>>>No. It provides an event record containing a user-defined state pointer
>>>>plus the IO readiness status change (different from IO completion status).
>>>>But that's what you need; you can do the call yourself.
>>>
>>>well, that means another syscall, which makes a footprint claim kind of moot,
>>>no?
>>
>>What syscall? You call sys_epoll once for every thousand events or so,
>>then you call your handler, which does a write or whatever. No
>>extra syscall.
>
> before a client can be sent the next chunk, the IO status of the last
> chunk must be tested. with the simple blocking, thread-per-client approach,
> this happens automatically (write returns the number of bytes written).
>
> with epoll, don't you have to do a syscall to query the status of
> the just-completed IO?

Nope. Just go ahead and write. (Same as with poll(), except that
with epoll, you only get notified once.) Any errors are reported
immediately by write(), so there's no more status to get.
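
A sketch of "just go ahead and write" on a non-blocking descriptor. struct out_buf and flush_some() are made-up names. The return value of write() says how much was accepted and errno says why it stopped, so no separate status query is needed.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical record of what is still owed to one client. */
struct out_buf {
    const char *data;
    size_t len;    /* total bytes to send */
    size_t sent;   /* bytes already accepted by the kernel */
};

/*
 * Push as much as the descriptor will take without blocking.
 * Returns 1 when everything is sent, 0 when the kernel buffer is full
 * (wait for the next writability event), -1 on any other failure.
 */
int flush_some(int fd, struct out_buf *ob)
{
    assert(ob != NULL && ob->sent <= ob->len);
    while (ob->sent < ob->len) {
        ssize_t n = write(fd, ob->data + ob->sent, ob->len - ob->sent);

        if (n > 0) {
            ob->sent += (size_t)n;      /* write() itself reports progress */
        } else if (n == -1 && errno == EINTR) {
            continue;                   /* interrupted by a signal, retry */
        } else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return 0;                   /* buffer full, resume on next event */
        } else {
            return -1;                  /* real error (or unexpected 0) */
        }
    }
    return 1;
}
```

The sent offset is the only state that has to be remembered between readiness events for a given reply.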
- Dan


--
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045

2003-01-27 09:39:12

by Terje Eggestad

Subject: Re: debate on 700 threads vs asynchronous code

Apart from the argument already given on other replies, you should
keep in mind that you probably need to give priority to doing receive.
That includes your clients, but if you don't you run into the risk of
significantly limiting your bandwidth since the send queues around your
system fill up.

Try doing that with threads.


Actually I would recommend the approach c)

c) Write an asynchronous system with only 2 or three threads where I
manage the connections and keep the state of each connection in a data
structure.
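
Approach c) might be sketched as below: everything a blocked thread would keep on its stack is stored in an explicit per-connection record and advanced by events. The states, struct conn, and the one-line request format are simplifying assumptions for illustration, not a real HTTP parser.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical phases of one request/reply exchange. */
enum conn_state { READING_REQUEST, SENDING_REPLY, DONE };

/* Per-connection state, a few dozen bytes instead of a whole thread stack. */
struct conn {
    enum conn_state state;
    size_t req_bytes;     /* request bytes received so far */
    size_t reply_left;    /* reply bytes still to send */
};

/* Feed newly received bytes. For brevity a request is one line ending in '\n'. */
void conn_on_input(struct conn *c, const char *buf, size_t n, size_t reply_len)
{
    assert(c != NULL && buf != NULL);
    if (c->state != READING_REQUEST)
        return;
    for (size_t i = 0; i < n; i++) {
        c->req_bytes++;
        if (buf[i] == '\n') {           /* request complete, switch phase */
            c->state = SENDING_REPLY;
            c->reply_left = reply_len;
            return;
        }
    }
}

/* Record that the socket accepted n more reply bytes. */
void conn_on_output(struct conn *c, size_t n)
{
    assert(c != NULL);
    if (c->state != SENDING_REPLY)
        return;
    c->reply_left = n >= c->reply_left ? 0 : c->reply_left - n;
    if (c->reply_left == 0)
        c->state = DONE;
}
```

Because the event loop decides which record to advance next, it can service all pending receives before any sends, which is the prioritisation described above.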


On Fri, 2003-01-24 at 00:19, Lee Chin wrote:
> Hi
> I am discussing with a few people on different approaches to solving a scale problem I am having, and have gotten vastly different views
>
> In a nutshell, as far as this debate is concerned, I can say I am writing a web server.
>
> Now, to cater to 700 clients, I can
> a) launch 700 threads that each block on I/O to disk and to the client (in reading and writing on the socket)
>
> OR
>
> b) Write an asynchronous system with only two or three threads where I manage the connections and stack (via setcontext, swapcontext, etc.), which is programmatically a little harder
>
> Which way will yield me better performance, considering both approaches are implemented optimally?
>
> Thanks
> Lee
--
_________________________________________________________________________

Terje Eggestad mailto:[email protected]
Scali Scalable Linux Systems http://www.scali.com

Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE)
P.O.Box 150, Oppsal +47 975 31 574 (MOBILE)
N-0619 Oslo fax: +47 22 62 89 51
NORWAY
_________________________________________________________________________

2003-01-27 21:42:44

by Bill Davidsen

Subject: Re: debate on 700 threads vs asynchronous code

On 27 Jan 2003, Terje Eggestad wrote:

> Apart from the argument already given on other replies, you should
> keep in mind that you probably need to give priority to doing receive.
> That includes your clients, but if you don't you run into the risk of
> significantly limiting your bandwidth since the send queues around your
> system fill up.
>
> Try doing that with threads.

Okay, I'm running my usenet exchange machines on Linux with Earthquake,
one thread per socket, 300-500 sockets, 700-800GB/day with incoming rate
spikes to 130Mbit on two 100Mbit NICs. What is it I'm supposed to try
doing with threads?

And if this is a webserver or anything like it, the incoming bandwidth is
probably orders of magnitude below the outgoing... Hum, like a usenet
reader server. Below, from a Linux box running Twister, also threaded per
feed in and per reader socket out.

load  free  buffs  swap  pgin  pgou  dk0  dk1  dk2  dk3  ipkt  opkt    int    ctx  usr  sys  idl  i_netK  o_netK
2.98   5.0   1807   0.0   544  2220   71   66   21    0  6173  3390   9600  17983    3   17   80  7170.8   941.9
4.77   4.5   1805   0.0  1117  6267   39  134  134    0  5403  3212   8780  20663    8   34   58  6645.4   978.9
2.35   4.3   1802   0.0  1529  6900   37  176  189    0  6134  3648  10007  18492    9   25   66  7470.4  1087.9
1.10   4.8   1800   0.0  1428  5609   33  149  150    0  5871  3447   9505  18028    9   25   66  7235.2   961.0
1.38   6.7   1798   0.0   970  6671   34  139  134    0  6250  3685  10051  20210    9   26   65  7503.4  1088.8
6.57   5.0   1797   0.0  1589  7673   89  184  188    0  5912  3571   9732  20165    8   33   59  7003.7  1169.3
2.30   4.6   1799   0.0  1648  5900   44  154  146    0  6539  3998  10660  17975    9   27   64  7631.0  1382.6

Forgive the formatting, it kind of breaks with larger numbers...

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2003-01-27 22:07:35

by Bill Davidsen

Subject: Re: debate on 700 threads vs asynchronous code

On Thu, 23 Jan 2003, Lee Chin wrote:

> I am discussing with a few people on different approaches to solving a
> scale problem I am having, and have gotten vastly different views
>
> In a nutshell, as far as this debate is concerned, I can say I am writing a web server.
>
> Now, to cater to 700 clients, I can a) launch 700 threads that each
> block on I/O to disk and to the client (in reading and writing on the
> socket)
>
> OR
>
> b) Write an asycnhrounous system with only 2 or three threads where I
> manage the connections and stack (via setcontext swapcontext etc), which
> is progromatically a little harder

There are many other ways, involving use of async io for disk and select
on some limited number of sockets per thread. If you want to wallow in
analysis paralysis you can certainly do it. Take a look at existing
usenet, mail, web and dns servers and you will see a number of ways to
attack this problem, and correctly implemented most of them work fine.

I believe Ingo mentioned some huge number of practical threads when he was
first talking about the latest thread library. If you believe it, or if
you really will be happy at 700 tasks per server, then thread per socket
is the easiest to implement, at least IMHO.

I'm using various news software which does most combinations of threading,
select, and even full processes per client, and none of them strike me as
being inherently better (as opposed to some being better implementations).
Ask Ingo how many threads you can really run in six months when the new
kernel and thread bits are more stable, that's the only scaling bit I
can't even guess. Pick one method, write code. I believe implementation
will be more important than method, unless you make a *really* bad choice.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2003-01-29 17:17:26

by Lee Chin

Subject: Re: debate on 700 threads vs asynchronous code

Today I do method (C)... but many people seem to say that, hey, pthreads does almost just that with a constant memory overhead of remembering the stack per blocking thread... so there is no time difference, just that pthreads consumes slightly more memory. That is the issue I am trying to get my head around.

That particular question, no one has answered... in Linux, the scheduler will not go around crazy trying to schedule processes that are all waiting on IO. Now the only time I see a degradation in threads would be if all are runnable.... in that case an async scheme with two threads would let each task run to completion, not thrashing the kernel. Is that correct to say?

2003-01-29 21:30:05

by Dan Kegel

Subject: Re: debate on 700 threads vs asynchronous code

Lee Chin wrote:
> Terje Eggestad <[email protected]> wrote:
>> Apart from the argument already given on other replies, you should
>> keep in mind that you probably need to give priority to doing receive.
>> That includes your clients, but if you don't you run into the risk of
>> significantly limiting your bandwidth since the send queues around your
>> system fill up.
>>
>> Try doing that with threads.
>>
>> Actually I would recommend the approach c)
>>
>> c) Write an asynchronous system with only 2 or three threads where I
>> manage the connections and keep the state of each connection in a data
>> structure.
>
> Today I do method (C)... but many people seem to say that, hey, pthreads does almost
> just that with a constant memory overhead of remembering the stack per blocking
> thread... so there is no time difference, just that pthreads consumes slightly more
> memory. That is the issue I am trying to get my head around.

The best way to get your head around it is to
benchmark both approaches, and spend some time
refining your implementation of each so you
understand where the bottlenecks are.

> That particular question, no one has answered... in Linux, the scheduler will not go
> around crazy trying to schedule processes that are all waiting on IO. Now the only
> time I see a degradation in threads would be if all are runnable.... in that case an async
> scheme with two threads would let each task run to completion, not thrashing the
> kernel. Is that correct to say?

There are lots of other issues, too.
Talk is cheap and fun, but only coding will give the real answer.
Go forth and code...

- Dan






2003-01-30 09:27:26

by Terje Eggestad

Subject: Re: debate on 700 threads vs asynchronous code

On Wed, 2003-01-29 at 18:26, Lee Chin wrote:
> Today I do method (C)... but many people seem to say that, hey,
> pthreads does almost just that with a constant memory overhead of
> remembering the stack per blocking thread... so there is no time
> difference, just that pthreads consumes slightly more memory. That is
> the issue I am trying to get my head around.
>
> That particular question, no one has answered... in Linux, the
> scheduler will not go around crazy trying to schedule processes that
> are all waiting on IO. Now the only time I see a degradation in threads
> would be if all are runnable.... in that case an async scheme with two
> threads would let each task run to completion, not thrashing the
> kernel. Is that correct to say?


Yes

And you can add that if you have many runnable threads, there will be an
extra overhead doing context switching.


--
_________________________________________________________________________

Terje Eggestad mailto:[email protected]
Scali Scalable Linux Systems http://www.scali.com

Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE)
P.O.Box 150, Oppsal +47 975 31 574 (MOBILE)
N-0619 Oslo fax: +47 22 62 89 51
NORWAY
_________________________________________________________________________