2002-06-13 08:14:37

by Roberto Fichera

Subject: Developing multi-threading applications

Hi All,

I'm designing a multithreaded application with many threads,
from ~100 up to 300-400. I need to decide which threading
library to use, and which kernel patches I need to improve
scheduler performance. The machines will be SMP Xeon boxes
with 4 or 8 processors and 4 GB of RAM.
Almost all the threads are computationally intensive, and the library
needs fast interprocess communication and synchronization,
because many of the threads, both synchronous and asynchronous,
are time-dependent and/or time-critical. In the future I plan to
distribute the threads across a pool of SMP boxes.

Thanks in advance.

Roberto Fichera.


2002-06-17 08:18:03

by Roberto Fichera

Subject: Re: Developing multi-threading applications

At 12.30 15/06/02 +0200, Ingo Oeser wrote:

>On Sat, Jun 15, 2002 at 11:01:44AM +0200, Roberto Fichera wrote:
> > > Even if that's true, and it's often not, how many different
> types
> > > of data
> > >acquisition can you have? Ten? Twenty? That's a far cry from 300.
> >
> > Currently are 190! Always active are ~110! So thinking by separating
> I/O from
> > the computation we double the threads.
>
>So basically you are just traversing your data depedency graph
>wrongly. Do a level order traversion if it is a dependency forest
>or an breadth first traversion if not.

Ok! I've simplified too much ;-)!

>If this node require IO -> schedule the IO and return back to the upper
>level noticing it, that you like to be woken, if the IO is
>finished.
>
>If this node require Computation -> do it, if this CPU is the one with
>lowest load, else schedule it for the CPU with lowest load.

How can I do that? Shouldn't this be the kernel's job? I could collect
the various patches around that implement CPU binding/affinity and
CPU load balancing, but how can I determine which CPU has the lowest
load at a given time?

>Continue with next node.
>
>(load is meant "number of compuations with same metric scheduled
>on this thread")
>
>Use only one thread per CPU. Try to make the IO-Waiting as unique
>as possible (poll would be perfect).

This could be implemented with process affinity, binding the
process to a CPU. But I still don't understand why
I must have only one thread per CPU. Is there a URL
where I can see some kernel/sched/VM/I-O or similar graphs about
this point?


>So this is all doable, once you analyze your data dependency
>graph properly and make the simulation data driven (which it
>usally is).
>
>Regards
>
>Ingo Oeser
>--
>Science is what we can tell a computer. Art is everything else. --- D.E.Knuth

Roberto Fichera.

2002-06-17 16:07:55

by Marco Colombo

Subject: Re: Developing multi-threading applications

On Mon, 17 Jun 2002, Roberto Fichera wrote:

[...]
> process to a CPU. But I continue to not hunderstand why
> I must have only one thread per CPU. There is some URL
> where can I see some kernel/sched/vm/I-O/other-think graph about
> this point ?

To put it simply, because you have only one PC per CPU. It's not
really an OS thing.

Every time you're saving the PC (and SP, and all the "thread context")
you're "emulating" more CPUs on just one. And what you got is just...
an emulation. A Thread is an execution abstraction, and a CPU is an
execution actor. Sounds sensible to match the two. Use functions instead
to group instructions by their (functional) meaning.

It makes much more sense, on a 4-way system, to have 4 rather complex
threads that are able to execute different functions, like in
a data-driven or event-driven model, than to run 400 simpler threads
which implement one function each, IMHO.

.TM.


2002-06-17 18:00:29

by Roberto Fichera

Subject: Re: Developing multi-threading applications

At 18.07 17/06/02 +0200, Marco Colombo wrote:

>On Mon, 17 Jun 2002, Roberto Fichera wrote:
>
>[...]
> > process to a CPU. But I continue to not hunderstand why
> > I must have only one thread per CPU. There is some URL
> > where can I see some kernel/sched/vm/I-O/other-think graph about
> > this point ?
>
>To put it simply, because you have only one PC per CPU. It's not
>really an OS thing.
>
>Every time you're saving the PC (and SP, and all the "thread context")
>you're "emulating" more CPUs on just one. And what you got is just...
>an emulation. A Thread is an execution abstraction, and a CPU is an
>execution actor. Sounds sensible to match the two. Use functions instead
>to group instructions by their (functional) meaning.

Yes! I know ;-)!

>It makes much more sense, on 4-ways system, to have 4 rather complex
>threads that are able to execute different functions, like in
>a data-driven or event-driven model, than to run 400 simpler threads
>which implement one function each, IMHO.

To make it simple, I'll try the 2 solutions!


>.TM.

Roberto Fichera.

2002-06-17 18:55:34

by Jakob Oestergaard

Subject: Re: Developing multi-threading applications

On Mon, Jun 17, 2002 at 06:07:51PM +0200, Marco Colombo wrote:
> On Mon, 17 Jun 2002, Roberto Fichera wrote:
>
> [...]
> > process to a CPU. But I continue to not hunderstand why
> > I must have only one thread per CPU. There is some URL
> > where can I see some kernel/sched/vm/I-O/other-think graph about
> > this point ?
>
> To put it simply, because you have only one PC per CPU. It's not
> really an OS thing.
>
> Every time you're saving the PC (and SP, and all the "thread context")
> you're "emulating" more CPUs on just one. And what you got is just...
> an emulation. A Thread is an execution abstraction, and a CPU is an
> execution actor. Sounds sensible to match the two. Use functions instead
> to group instructions by their (functional) meaning.

It is common to use many threads per processor on some operating
systems. But this is (in my experience) because of the lack of proper
non-blocking APIs on said OS.

You can emulate non-blocking APIs with threads and a blocking API. And
on some systems you simply have to.

On GNU/Linux this is not generally a problem. And as Marco said, you
really shouldn't have to do that.

--
................................................................
: [email protected] : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............:

2002-06-13 08:27:01

by David Schwartz

Subject: Re: Developing multi-threading applications


On Thu, 13 Jun 2002 10:13:35 +0200, Roberto Fichera wrote:

>I'm designing a multithreding application with many threads,
>from ~100 to 300/400. I need to take some decisions about
>which threading library use, and which patch I need for the
>kernel to improve the scheduler performances. The machines
>will be a SMP Xeon with 4/8 processors with 4Gb RAM.
>All threads are almost computational intensive and the library
>need a fast interprocess comunication and syncronization
>because there are many sync & async threads time
>dependent and/or critical. I'm planning, in the future, to distribuite
>all the threads in a pool of SMP box.

With 4/8 processors, you don't want to create 100-400 threads doing
computation intensive tasks. So redesign things so that the number of threads
you create is more in line with the number of CPUs you have available. That
is, use a 'thread per CPU' (or slightly more threads than there are CPUs per
node) approach and you'll perform a lot better. Distribute the available work
over the available threads.
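
A minimal sketch of the 'thread per CPU' idea, assuming POSIX threads and
that sysconf(_SC_NPROCESSORS_ONLN) is available (it is on Linux with glibc,
but POSIX does not require it). The names worker_count, parallel_sum and
struct slice are made up for illustration, and the summing job only stands
in for real computation.

```c
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_WORKERS 64

/* One worker per online CPU, clamped to [1, MAX_WORKERS]. */
static int worker_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    if (n < 1)
        n = 1;
    if (n > MAX_WORKERS)
        n = MAX_WORKERS;
    return (int)n;
}

/* A contiguous part of the input handled by one worker. */
struct slice { const int *data; size_t begin, end; long sum; };

static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    s->sum = 0;
    for (size_t i = s->begin; i < s->end; i++)
        s->sum += s->data[i];
    return NULL;
}

/* Split data[0..len) into one slice per worker and sum the slices in
 * parallel. The thread count follows the CPU count, not the data size. */
static long parallel_sum(const int *data, size_t len)
{
    int n = worker_count();
    pthread_t tid[MAX_WORKERS];
    struct slice sl[MAX_WORKERS];
    int started[MAX_WORKERS];
    long total = 0;

    for (int i = 0; i < n; i++) {
        sl[i].data = data;
        sl[i].begin = len * (size_t)i / (size_t)n;
        sl[i].end = len * (size_t)(i + 1) / (size_t)n;
        started[i] = pthread_create(&tid[i], NULL, sum_slice, &sl[i]) == 0;
        if (!started[i])
            sum_slice(&sl[i]);      /* fall back to doing the slice here */
    }
    for (int i = 0; i < n; i++) {
        if (started[i])
            pthread_join(tid[i], NULL);
        total += sl[i].sum;
    }
    return total;
}
```

On an 8-CPU box parallel_sum() would start 8 threads however many logical
tasks the application has; the work is divided among them.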

DS


2002-06-13 09:08:27

by Roberto Fichera

Subject: Re: Developing multi-threading applications

At 01.26 13/06/02 -0700, you wrote:

>On Thu, 13 Jun 2002 10:13:35 +0200, Roberto Fichera wrote:
>
> >I'm designing a multithreding application with many threads,
> >from ~100 to 300/400. I need to take some decisions about
> >which threading library use, and which patch I need for the
> >kernel to improve the scheduler performances. The machines
> >will be a SMP Xeon with 4/8 processors with 4Gb RAM.
> >All threads are almost computational intensive and the library
> >need a fast interprocess comunication and syncronization
> >because there are many sync & async threads time
> >dependent and/or critical. I'm planning, in the future, to distribuite
> >all the threads in a pool of SMP box.
>
> With 4/8 processors, you don't want to create 100-400 threads doing
>computation intensive tasks. So redesign things so that the number of threads
>you create is more in line with the number of CPUs you have available. That
>is, use a 'thread per CPU' (or slightly more threads than their are CPUs per
>node) approach and you'll perform a lot better. Distribute the available work
>over the available threads.

You are right! But "computationally intensive" is not quite what I meant ;-),
because most of the threads are waiting for I/O. After the I/O completes, a
thread performs its computationally intensive task and, once finished, sends
its results to its parent thread. The parent collects all its children's
results, performs some computation, and sends its own result to its parent,
and so on, with many parent threads each controlling other children. So I think
the main problem/overhead is thread creation and the number of threads.


> DS

Roberto Fichera.

2002-06-13 09:42:42

by Peter Wächtler

Subject: Re: Developing multi-threading applications

Roberto Fichera wrote:
> At 01.26 13/06/02 -0700, you wrote:
>
>> On Thu, 13 Jun 2002 10:13:35 +0200, Roberto Fichera wrote:
>>
>> >I'm designing a multithreding application with many threads,
>> >from ~100 to 300/400. I need to take some decisions about
>> >which threading library use, and which patch I need for the
>> >kernel to improve the scheduler performances. The machines
>> >will be a SMP Xeon with 4/8 processors with 4Gb RAM.
>> >All threads are almost computational intensive and the library
>> >need a fast interprocess comunication and syncronization
>> >because there are many sync & async threads time
>> >dependent and/or critical. I'm planning, in the future, to distribuite
>> >all the threads in a pool of SMP box.
>>
>> With 4/8 processors, you don't want to create 100-400 threads
>> doing
>> computation intensive tasks. So redesign things so that the number of
>> threads
>> you create is more in line with the number of CPUs you have available.
>> That
>> is, use a 'thread per CPU' (or slightly more threads than their are
>> CPUs per
>> node) approach and you'll perform a lot better. Distribute the
>> available work
>> over the available threads.
>
>
> You are right! But "computational intensive" is not totaly right as I
> say ;-),
> because most of thread are waiting for I/O, after I/O are performed the
> computational intensive tasks, finished its work all the result are sent
> to thread-father, the father collect all the child's result and perform
> some
> computational work and send its result to its father and so on with many
> thread-father controlling other child. So I think the main problem/overhead
> is thread creation and the thread's numbers.
>

Have a look at http://www-124.ibm.com/developerworks/opensource/pthreads/

They provide an M:N threading model where threads can live in userspace.


2002-06-13 09:52:48

by Roberto Fichera

Subject: Re: Developing multi-threading applications

At 11.44 13/06/02 +0200, Peter Wächtler wrote:

>>You are right! But "computational intensive" is not totaly right as I say
>>;-),
>>because most of thread are waiting for I/O, after I/O are performed the
>>computational intensive tasks, finished its work all the result are sent
>>to thread-father, the father collect all the child's result and perform some
>>computational work and send its result to its father and so on with many
>>thread-father controlling other child. So I think the main problem/overhead
>>is thread creation and the thread's numbers.
>
>Have a look at http://www-124.ibm.com/developerworks/opensource/pthreads/
>
>they provide M:N threading model where threads can live in userspace.

Yes! I'm looking at it. But I want to evaluate some others first.

Roberto Fichera.

2002-06-13 10:15:01

by Peter Wächtler

Subject: Re: Developing multi-threading applications

Roberto Fichera wrote:
> At 11.44 13/06/02 +0200, Peter Wächtler wrote:
>
>>> You are right! But "computational intensive" is not totaly right as I
>>> say ;-),
>>> because most of thread are waiting for I/O, after I/O are performed the
>>> computational intensive tasks, finished its work all the result are sent
>>> to thread-father, the father collect all the child's result and
>>> perform some
>>> computational work and send its result to its father and so on with many
>>> thread-father controlling other child. So I think the main
>>> problem/overhead
>>> is thread creation and the thread's numbers.
>>
>>
>> Have a look at http://www-124.ibm.com/developerworks/opensource/pthreads/
>>
>> they provide M:N threading model where threads can live in userspace.
>
>
> Yes! I'm looking for it. But I want evaluate some other before.
>

There is a paper rse-pmt.ps included in the tar archives from Ralf Engelschall
(author of GNU portable threads).

There you will find lots of interesting pointers to other thread packages.


2002-06-13 10:13:44

by David Schwartz

Subject: Re: Developing multi-threading applications


On Thu, 13 Jun 2002 11:08:27 +0200, Roberto Fichera wrote:
>You are right! But "computational intensive" is not totaly right as I say ;-
>),

It's really not fair to change the premises in the middle of an argument.

>because most of thread are waiting for I/O,

Still wrong. You don't tie up threads waiting for I/O. You can wait without
having a thread doing the waiting.

>after I/O are performed the
>computational intensive tasks, finished its work all the result are sent
>to thread-father,

Okay, so you need a new abstraction -- separate the waiting from the
working. Create as many threads to do the work as you have processors to do
the work on. As for the waiting, minimize threads waiting, they're pure
overhead. If it's sockets, use 'poll' so one thread can do lots of waiting.
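
A minimal sketch of one thread waiting on many descriptors with poll(),
assuming the descriptors are pipes or sockets. The helper name
wait_readable and the MAX_FDS limit are made up for illustration.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_FDS 64

/* Wait up to timeout_ms for any of the nfds descriptors to become
 * readable. Stores the indexes of the ready ones in ready[] and returns
 * how many there are, or -1 on error. */
static int wait_readable(const int *fds, int nfds, int timeout_ms, int *ready)
{
    struct pollfd pfd[MAX_FDS];
    int count = 0;

    if (nfds < 0 || nfds > MAX_FDS)
        return -1;
    for (int i = 0; i < nfds; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
    }
    if (poll(pfd, (nfds_t)nfds, timeout_ms) < 0)
        return -1;
    for (int i = 0; i < nfds; i++)
        if (pfd[i].revents & (POLLIN | POLLHUP))
            ready[count++] = i;
    return count;
}
```

A single call reports every descriptor that is ready, so the waiting
thread wakes once and can hand all of the ready work to the workers.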

>the father collect all the child's result and perform some
>computational work and send its result to its father and so on with many
>thread-father controlling other child. So I think the main problem/overhead
>is thread creation and the thread's numbers.

So get rid of the problem! Don't create so many threads, create only as many
threads as can do useful work and reuse them rather than destroying and
recreating them. Solve the actual problem/overhead since it's totally
artificial and due to your model rather than your problem!

DS


2002-06-13 10:25:12

by Roberto Fichera

Subject: Re: Developing multi-threading applications

At 11.31 13/06/02 +0200, Ingo Oeser wrote:

>On Thu, Jun 13, 2002 at 11:08:27AM +0200, Roberto Fichera wrote:
> > You are right! But "computational intensive" is not totaly right as I
> say ;-),
> > because most of thread are waiting for I/O, after I/O are performed the
> > computational intensive tasks, finished its work all the result are sent
> > to thread-father, the father collect all the child's result and perform
> some
> > computational work and send its result to its father and so on with many
> > thread-father controlling other child. So I think the main problem/overhead
> > is thread creation and the thread's numbers.
>
>So you are creating a simulation/emulation application/engine, right?
>Or a measured data analysis engine? (which is basically the same
>task)

Yes! It's a simulation/emulation application.

>For these kind of tasks creating your own kind of "threads" is
>probably better.
>
>Split it in the following data structure:
>
>struct my_thread {
> actor_function_t actor;
> input_t inbuf;
> output_t outbuf;
> state_t statebuf;
>};
>
>And provide rules and primitives for accessing inbuf/outbuf, if
>they might be shared (which is probable).

This can be a solution.
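
A minimal sketch of how such a structure might be filled in. The quoted
message leaves actor_function_t, input_t, output_t and state_t undefined,
so the definitions below are assumptions made only for illustration, as
are average_actor and run_once.

```c
#include <stddef.h>

/* Hypothetical buffer types; the quoted message does not define them. */
typedef struct { const double *values; size_t count; } input_t;
typedef struct { double value; int valid; } output_t;
typedef struct { unsigned long runs; } state_t;

struct my_thread;
typedef void (*actor_function_t)(struct my_thread *self);

struct my_thread {
    actor_function_t actor;
    input_t inbuf;
    output_t outbuf;
    state_t statebuf;
};

/* Example actor: averages its input. Its cost depends only on the size
 * of inbuf, not on the values stored in it. */
static void average_actor(struct my_thread *self)
{
    double sum = 0.0;
    for (size_t i = 0; i < self->inbuf.count; i++)
        sum += self->inbuf.values[i];
    self->outbuf.value =
        self->inbuf.count ? sum / (double)self->inbuf.count : 0.0;
    self->outbuf.valid = 1;
    self->statebuf.runs++;
}

/* Run one step of a logical thread on whichever real worker calls this. */
static void run_once(struct my_thread *t)
{
    t->actor(t);
}
```

Because a logical thread is just data plus a function pointer, any real
worker thread can pick it up and run it.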


>Now you can build a dependency tree/graph for the whole stuff
>easily and schedule works of the same level to some real worker
>threads (which might be on different machines), which are one per CPU.
>
>The problem is to build the actor as a REAL primitive, that
>scales only by the size of inbuf and not by the contents of it.

Yes!

>Everything else is going to be bloated and not really scalable,
>but can be implemented by every "Joe Programmer" after finishing
>high school ;-)

It depends on the threading library, and whether it's totally userspace or not!
With so many threads that aren't totally userspace, the scheduler's
performance and characteristics matter a lot. I would prefer a mixed
solution, for example, because some problems can be solved easily
with userspace threads and others cannot.


>Regards
>
>Ingo Oeser
>--
>Science is what we can tell a computer. Art is everything else. --- D.E.Knuth

Roberto Fichera.

2002-06-13 10:42:42

by Roberto Fichera

Subject: Re: Developing multi-threading applications

At 12.16 13/06/02 +0200, Peter Wächtler wrote:
>Roberto Fichera wrote:
>>At 11.44 13/06/02 +0200, Peter Wächtler wrote:
>>
>>>>You are right! But "computational intensive" is not totaly right as I
>>>>say ;-),
>>>>because most of thread are waiting for I/O, after I/O are performed the
>>>>computational intensive tasks, finished its work all the result are sent
>>>>to thread-father, the father collect all the child's result and perform
>>>>some
>>>>computational work and send its result to its father and so on with many
>>>>thread-father controlling other child. So I think the main problem/overhead
>>>>is thread creation and the thread's numbers.
>>>
>>>
>>>Have a look at http://www-124.ibm.com/developerworks/opensource/pthreads/
>>>
>>>they provide M:N threading model where threads can live in userspace.
>>
>>Yes! I'm looking for it. But I want evaluate some other before.

And I don't want to use a library that's totally in userspace.


>There is a paper rse-pmt.ps included in the tar archives from Ralf Engelschall
>(author of GNU portable threads).
>
>There you will find lots of interesting pointers to other thread packages.

I'll take a look. Thanks!




Roberto Fichera.

2002-06-13 11:21:53

by Roberto Fichera

Subject: Re: Developing multi-threading applications

At 03.13 13/06/02 -0700, you wrote:


>On Thu, 13 Jun 2002 11:08:27 +0200, Roberto Fichera wrote:
> >You are right! But "computational intensive" is not totaly right as I say ;-
> >),
>
> It's really not fair to change the premises in the middle of an
> argument.

Sorry ;-)!


> >because most of thread are waiting for I/O,
>
> Still wrong. You don't tie up threads waiting for I/O. You can
> wait without
>having a thread doing the waiting.
>
> >after I/O are performed the
> >computational intensive tasks, finished its work all the result are sent
> >to thread-father,
>
> Okay, so you need a new abstraction -- separate the waiting from the
>working. Create as many threads to do the work as you have processors to do
>the work on. As for the waiting, minimize threads waiting, they're pure
>overhead. If it's sockets, use 'poll' so one thread can do lots of waiting.

This is a possible solution.

> >the father collect all the child's result and perform some
> >computational work and send its result to its father and so on with many
> >thread-father controlling other child. So I think the main problem/overhead
> >is thread creation and the thread's numbers.
>
> So get rid of the problem! Don't create so many threads, create
> only as many
>threads as can do useful work and reuse them rather than destroying and
>recreating them. Solve the actual problem/overhead since it's totally
>artificial and due to your model rather than your problem!

It depends on the application. In my simulation/emulation program I need to
create many threads because each thread solves/manages/computes a specific
problem, and its lifetime depends on several factors. Each thread is created
only when needed, to avoid overhead. The simulation/emulation is a "merge" of
a great many objects, each of which works to solve/manage/compute a specific
problem. The low-level objects are grouped to solve a specific problem and are
managed by a controller thread that takes decisions or does some work. Some
controller threads are in turn grouped and managed by another controller
thread, and so on. Don't assume that I always need 400 active threads; they
are created only when a controller needs them. Think of this
simulation/emulation as a collection of a great many objects that must
interoperate; the model is designed to scale easily in a distributed
environment.


> DS

Roberto Fichera.

2002-06-13 11:58:10

by David Schwartz

Subject: Re: Developing multi-threading applications


>Depending by the applications. With my simulation/emulation program I need
>to create
>many thread because each thread resolve/manage/compute a specific problem
>and
>it's live depend by some factors. Each thread is create only if needed to
>avoid the
>overhead. The simulation/emulation is a "merge" of many and many object,
>each object
>work to resolve/manage/compute a specific problem. All the low objects are
>grouped to
>resolve a specific problem and are managed by a thread controller that
>should take some
>decision or doing some work. Some thread controller are grouped and managed
>by another
>thread controller and so on. Do not think that I need always 400 threads
>active they are
>create only if need by the controller. You must thinks this
>simulation/emulation as collection
>of many and many object that should interoperate, and the model is designed
>to scale easily
>on a distribuite environment.

If it's a simulation, you don't *really* need the threads, you just need to
be able to act as if you had them. After all, what are you simulating if what
work gets done when is up to the random vagaries of the OS scheduler?

If it's a real application wanting real performance, the suggestions I made
stand -- you don't want many more threads working than you have CPUs and you
don't want a lot of threads sitting around waiting for work (and thus forcing
bazillions of extra context switches).

It sounds to me like your design is broken, needlessly mapping threads to
I/Os that are being waited for one-to-one. This is a common error among
programmers who consciously or subconsciously have accepted the 'more threads
can do more work' philosophy.

What you need to do is take whatever it is you're thinking of as a 'thread'
right now, which I'd roughly define as 'one logical task, from start to
completion' and realize that there is absolutely no reason to map this
one-to-one to actual pthreads threads and every reason in the world not to.

This will conserve resources (12 thread stacks instead of 300, 12 KSEs
instead of 300), reduce context switches (context switches will only occur
when there's no work to do at all or a thread uses up its entire timeslice
rather than every time we change which client/task we're doing work for/on),
improve scheduler efficiency (because the number of ready threads will not
exceed the number of CPUs by much) and more often than not, clean up a lot of
ugliness in your architecture (because threads are probably being used
instead of a sane abstraction for 'work to be done' or 'a client I'm doing
work for').

DS


2002-06-13 16:26:51

by Roberto Fichera

Subject: Re: Developing multi-threading applications

At 04.58 13/06/02 -0700, David Schwartz wrote:

> If it's a simulation, you don't *really* need the threads, you
> just need to
>be able to act as if you had them. After all, what are you simulating if what
>work gets done when is up to the random vagaries of the OS scheduler?
>
> If it's a real application wanting real performance, the
> suggestions I made
>stand -- you don't want many more threads working than you have CPUs and you
>don't want a lot of threads sitting around waiting for work (and thus forcing
>bazillions of extra context switches).

This is a scheduler problem! All threads waiting for I/O are blocked by
the scheduler, and this has no impact on context switches;
it only lengthens the wait queue. Using Ingo's O(1) scheduler, a big piece
of code, should make a big difference, for example.

> It sounds to me like your design is broken, needlessly mapping
> threads to
>I/Os that are being waited for one-to-one. This is a common error among
>programmers who consciously or subconsciously have accepted the 'more threads
>can do more work' philosophy.

I don't think "more threads == more work done"! With the threaded approach it's
possible to split a big sequential program into a variety of concurrent logical
programs, which is a big win for code revisions and new implementations.

> What you need to do is take whatever it is you're thinking of as
> a 'thread'
>right now, which I'd roughly define as 'one logical task, from start to
>completion' and realize that there is absolutely no reason to map this
>one-to-one to actual pthreads threads and every reason in the world not to.
>
> This will conserve resources (12 thread stacks instead of 300, 12
> KSEs
>instead of 300), reduce context switches (context switches will only occur
>when there's no work to do at all or a thread uses up its entire timeslice
>rather than every time we change which client/task we're doing work for/on),
>improve scheduler efficiency (because the number of ready threads will not
>exceed the number of CPUs by much) and more often than not, clean up a lot of
>ugliness in your architecture (because threads are probably being used
>instead of a sane abstraction for 'work to be done' or 'a client I'm doing
>work for').

You are right! But it depends on the application! If you have to do I/O like
signal acquisition, sensor acquisition and so on, you must have one thread
for each type of data acquisition, and you must have a thread that performs
some computation on, for example, a subset of this data and generates output
that could be new input for another thread. This makes the environment more
realistic. I agree with you that if we increase the number of threads the
system could collapse (= context switches become expensive = we must increase
the number of CPUs, add a new box, or take a new approach).


Roberto Fichera.

2002-06-14 20:56:04

by David Schwartz

Subject: Re: Developing multi-threading applications


On Thu, 13 Jun 2002 18:26:54 +0200, Roberto Fichera wrote:
>At 04.58 13/06/02 -0700, David Schwartz wrote:

>This is a scheduler problem! All threads waiting for I/O are blocked by
>the scheduler, and this doesn't have any impact for the context switches
>it increase only the waitqueue, using the Ingo's O(1) scheduler, a big piece
>of code, it should make a big difference for example.

You are incorrect. If you have ten threads each waiting for an I/O and all
ten I/Os are ready, then ten context switches are needed. If you have one
thread waiting for ten I/Os, and all ten I/Os come ready, one context switch is
needed.

[snip]

>I don't think "more threads == more work done"! With the thread's approch
>it's
>possible to split a big sequential program in a variety of concurrent
>logical
>programs with a big win for code revisions and new implementation.

I'm not advising eliminating the threads approach. I'm only advising not
using threads as your abstraction for clients or work to be done. Use threads
as the execution vehicles that pick up work when there's work to be done.
(Think thread pools, think separating I/O from computation.)
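
A minimal sketch of threads used as execution vehicles that pick up work
from a shared queue, assuming POSIX threads. All names here (struct pool,
pool_start, pool_submit, pool_finish, count_job) are made up for
illustration, and the fixed-size ring buffer is a simplification.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 256
#define POOL_MAX  16

typedef void (*work_fn)(void *arg);
struct work { work_fn fn; void *arg; };

struct pool {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    struct work     q[QUEUE_CAP];   /* ring buffer of pending work */
    size_t          head, count;
    int             closing, nthreads;
    pthread_t       tid[POOL_MAX];
};

/* Each worker sleeps until there is work, then runs it outside the lock. */
static void *pool_worker(void *p)
{
    struct pool *pl = p;
    for (;;) {
        pthread_mutex_lock(&pl->lock);
        while (pl->count == 0 && !pl->closing)
            pthread_cond_wait(&pl->nonempty, &pl->lock);
        if (pl->count == 0) {       /* closing and queue drained */
            pthread_mutex_unlock(&pl->lock);
            return NULL;
        }
        struct work w = pl->q[pl->head];
        pl->head = (pl->head + 1) % QUEUE_CAP;
        pl->count--;
        pthread_mutex_unlock(&pl->lock);
        w.fn(w.arg);
    }
}

static int pool_start(struct pool *pl, int nthreads)
{
    if (nthreads < 1 || nthreads > POOL_MAX)
        return -1;
    pthread_mutex_init(&pl->lock, NULL);
    pthread_cond_init(&pl->nonempty, NULL);
    pl->head = pl->count = 0;
    pl->closing = pl->nthreads = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&pl->tid[i], NULL, pool_worker, pl) != 0)
            break;
        pl->nthreads++;
    }
    return pl->nthreads > 0 ? 0 : -1;
}

/* Queue one unit of work; returns -1 if the queue is full or closing. */
static int pool_submit(struct pool *pl, work_fn fn, void *arg)
{
    int rc = -1;
    pthread_mutex_lock(&pl->lock);
    if (!pl->closing && pl->count < QUEUE_CAP) {
        size_t tail = (pl->head + pl->count) % QUEUE_CAP;
        pl->q[tail].fn = fn;
        pl->q[tail].arg = arg;
        pl->count++;
        pthread_cond_signal(&pl->nonempty);
        rc = 0;
    }
    pthread_mutex_unlock(&pl->lock);
    return rc;
}

/* Let the workers drain the queue, then join them. */
static void pool_finish(struct pool *pl)
{
    pthread_mutex_lock(&pl->lock);
    pl->closing = 1;
    pthread_cond_broadcast(&pl->nonempty);
    pthread_mutex_unlock(&pl->lock);
    for (int i = 0; i < pl->nthreads; i++)
        pthread_join(pl->tid[i], NULL);
}

/* Example job: add 1 to a counter protected by its own mutex. */
struct counter { pthread_mutex_t lock; long value; };

static void count_job(void *arg)
{
    struct counter *c = arg;
    pthread_mutex_lock(&c->lock);
    c->value++;
    pthread_mutex_unlock(&c->lock);
}
```

With this shape, 300 logical tasks become 300 calls to pool_submit()
served by a handful of threads, and an I/O thread can submit work as
descriptors become ready.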

[snip]
>You are right! But depend by the application! If you have todo I/O like
>signal acquisition,
>sensors acquisitions and so on, you must have a one thread for each type of
>data acquisition,

Even if that's true, and it's often not, how many different types of data
acquisition can you have? Ten? Twenty? That's a far cry from 300.

DS


2002-06-15 09:01:46

by Roberto Fichera

Subject: Re: Developing multi-threading applications

At 13.56 14/06/02 -0700, David Schwartz wrote:


>On Thu, 13 Jun 2002 18:26:54 +0200, Roberto Fichera wrote:
> >At 04.58 13/06/02 -0700, David Schwartz wrote:
>
> >This is a scheduler problem! All threads waiting for I/O are blocked by
> >the scheduler, and this doesn't have any impact for the context switches
> >it increase only the waitqueue, using the Ingo's O(1) scheduler, a big piece
> >of code, it should make a big difference for example.
>
> You are incorrect. If you have ten threads each waiting for an
> I/O and all
>ten I/Os are ready, then ten context switches are needed. If you have one
>thread waiting for ten I/Os, and then I/Os come ready, one context switch is
>needed.

You are right in this specific case, but it always depends on what kind of I/O
you have to do. Not every case can be reduced to your logic, only
specific ones. It's only a "local" optimization.

>[snip]
>
> >I don't think "more threads == more work done"! With the thread's approch
> >it's
> >possible to split a big sequential program in a variety of concurrent
> >logical
> >programs with a big win for code revisions and new implementation.
>
> I'm not advising eliminating the threads approach. I'm only
> advising not
>using threads as your abstraction for clients or work to be done. Use threads
>as the execution vehicles that pick up work when there's work to be done.
>(Think thread pools, think separating I/O from computation.)

Yes! This is what I want!

>[snip]
> >You are right! But depend by the application! If you have todo I/O like
> >signal acquisition,
> >sensors acquisitions and so on, you must have a one thread for each type of
> >data acquisition,
>
> Even if that's true, and it's often not, how many different types
> of data
>acquisition can you have? Ten? Twenty? That's a far cry from 300.

Currently there are 190! About 110 are always active! So by separating I/O from
the computation we would double the number of threads.

Roberto Fichera.

2002-06-15 10:57:21

by Ingo Oeser

Subject: Re: Developing multi-threading applications

On Sat, Jun 15, 2002 at 11:01:44AM +0200, Roberto Fichera wrote:
> > Even if that's true, and it's often not, how many different types
> > of data
> >acquisition can you have? Ten? Twenty? That's a far cry from 300.
>
> Currently are 190! Always active are ~110! So thinking by separating I/O from
> the computation we double the threads.

So basically you are just traversing your data dependency graph
wrongly. Do a level-order traversal if it is a dependency forest,
or a breadth-first traversal if not.

If this node requires I/O -> schedule the I/O and return to the upper
level, notifying it that you would like to be woken when the I/O is
finished.

If this node requires computation -> do it if this CPU is the one with the
lowest load; otherwise schedule it for the CPU with the lowest load.

Continue with next node.

(load here means "number of computations with the same metric scheduled
on this thread")
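
A minimal sketch of this notion of load, kept entirely in the application
rather than measured by the kernel. The names least_loaded and assign are
made up for illustration, and the counts are assumed to be updated by a
single scheduling thread (otherwise they would need locking).

```c
#include <stddef.h>

/* Return the index of the worker with the fewest queued computations.
 * Ties go to the lowest index. load[] is application bookkeeping, not a
 * kernel measurement. */
static size_t least_loaded(const unsigned *load, size_t nworkers)
{
    size_t best = 0;
    for (size_t i = 1; i < nworkers; i++)
        if (load[i] < load[best])
            best = i;
    return best;
}

/* Assign one computation: bump the chosen worker's count and return
 * which worker was picked. */
static size_t assign(unsigned *load, size_t nworkers)
{
    size_t w = least_loaded(load, nworkers);
    load[w]++;
    return w;
}
```

A worker would decrement its own count when it finishes a computation, so
the numbers track the work still pending on each thread.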

Use only one thread per CPU. Try to do the I/O waiting in as few places
as possible (poll would be perfect).


So this is all doable, once you analyze your data dependency
graph properly and make the simulation data driven (which it
usually is).
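
A minimal sketch of one way to read the level-order idea: give every node
of the dependency graph a level, so that all nodes on the same level can
be handed to the workers together. The adjacency-matrix layout, MAX_NODES
and compute_levels are assumptions made for illustration.

```c
#include <stddef.h>

#define MAX_NODES 64

/* dep[i][j] != 0 means node i needs the result of node j.
 * Fills level[] so that nodes without dependencies get level 0 and every
 * other node gets 1 + the highest level among its dependencies.
 * Returns the number of levels, or -1 if the graph has a cycle. */
static int compute_levels(int n, unsigned char dep[][MAX_NODES], int *level)
{
    int done = 0, nlevels = 0;

    for (int i = 0; i < n; i++)
        level[i] = -1;                      /* not placed yet */

    while (done < n) {
        int progressed = 0;
        for (int i = 0; i < n; i++) {
            if (level[i] >= 0)
                continue;
            int ready = 1;
            for (int j = 0; j < n; j++) {
                /* a dependency must sit on an earlier level */
                if (dep[i][j] && (level[j] < 0 || level[j] >= nlevels)) {
                    ready = 0;
                    break;
                }
            }
            if (ready) {
                level[i] = nlevels;
                progressed++;
            }
        }
        if (progressed == 0)
            return -1;                      /* cycle: nothing can run */
        done += progressed;
        nlevels++;
    }
    return nlevels;
}
```

Level 0 would typically hold the I/O nodes; once their data has arrived,
level 1 can be spread over the per-CPU workers, and so on upwards.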

Regards

Ingo Oeser
--
Science is what we can tell a computer. Art is everything else. --- D.E.Knuth