I have an interesting question about a difference in threading behavior
we're seeing between Fedora Core 3 and Fedora Core 5. We're running on Dell
PowerEdge 1800 hardware with a Xeon processor with hyper-threading turned
on. Both systems are using a 2.6.16.16 kernel (MVP a la special).
We have a multithreaded application that starts two worker threads. On
Fedora Core 3, in both of these we use getpid() to get the PID of the thread
and then use sched_setaffinity() to assign each thread to its own CPU.
Watching the system monitor, both threads run almost symmetrically, each on
its given CPU.
On Fedora Core 5, with whatever new threading mechanism is being used,
getpid() no longer works on threads; it returns the same PID as the parent
program. So we're using pthread_self() to get a u_long thread ID back.
However, sched_setaffinity() doesn't accept that; it wants a real PID. So
we're leaving it to the system to schedule the threads between the CPUs.
On FC3 the threads run on two CPUs symmetrically and almost in parallel.
On FC5 it doesn't work like that. The threading looks almost serial: one
thread runs on CPU1 at 100%, then as it's finishing and its CPU utilization
is coming down, thread two comes up to 100% on CPU2, and they ping-pong back
and forth, which is costing us a lot of time!
What am I missing? What do I need to do in FC5, the kernel, or the
threading library to get my threads to run in symmetric parallel again?
Thanks!
--
-brian
Brian McGrew { [email protected] || [email protected] }
--
> With hope comes chance,
with chance comes destiny,
with destiny comes fate,
And with fate comes the courtesy flush!
On Fri, 2007-01-19 at 10:43 -0800, Brian McGrew wrote:
> We have a multithreaded application that starts two worker threads. On
> Fedora Core 3, in both of these we use getpid() to get the PID of the thread
> and then use sched_setaffinity() to assign each thread to its own CPU.
> Watching the system monitor, both threads run almost symmetrically, each on
> its given CPU.
This is odd; even in FC3, getpid() is supposed to return the process ID,
not the thread ID.
> What am I missing? What do I need to do in FC5 or the kernel or the
> threading library to get my threads to run in symmetric parallel again???
You should fix the app to use something like pthread_self() instead
(or the highly unportable gettid(), but that would just be horrible).
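
For anyone following along, here is a minimal sketch of the pthread_self()
route, assuming a glibc that provides the GNU extension
pthread_setaffinity_np() (the _np suffix means non-portable). Unlike
sched_setaffinity(), it takes the pthread_t directly, so no PID is needed.
The helper names first_allowed_cpu() and pin_self_to_cpu() are made up for
this illustration and are not from the application being discussed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: return the lowest-numbered CPU the calling thread
 * is currently allowed to run on, or -1 if the mask cannot be read. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success or an errno-style error number on failure
 * (pthread functions return the error instead of setting errno). */
static int pin_self_to_cpu(int cpu)
{
    cpu_set_t set;
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

Each worker would call pin_self_to_cpu() with its own CPU number at the top
of its thread function, for example 0 in the first worker and 1 in the
second. Which logical CPU numbers map to which hyper-threading sibling is
machine-specific, so those values are only placeholders.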
--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
On 1/19/07 10:55 AM, "Arjan van de Ven" <[email protected]> wrote:
> you should fix the app to use something like pthread_self() instead...
> (or the highly unportable gettid() but that would just be horrible)
-----
And on FC5 I am using pthread_self(), but my problem isn't simply with
pthread_self(); it's with the scheduling. On FC3 both threads run
simultaneously, in almost symmetric parallel. On FC5 one thread doesn't pick
up and start until the previous one is done. On FC3, using getpid() for the
thread, I could use sched_setaffinity() to force each thread onto its own
processor, and with FC5 I can't, so I've got one idle processor all the time.
-brian
Brian McGrew { [email protected] || [email protected] }
--
> Do not read this email while waxing that cat!
>
> And on FC5 I am using pthread_self but my problem isn't simply with
> pthread_self, it's with the scheduling.
Maybe your kernel has broken scheduler load balancing? You really
shouldn't have to do this manually. At all.
> On FC3 both threads run
> simultaneously, in almost symmetric parallel. On FC5 one thread doesn't pick
> up and start until the previous one is done. On FC3, using getpid() for the
> thread, I could use sched_setaffinity() to force each thread onto its own
> processor, and with FC5 I can't, so I've got one idle processor all the time.
Again, you can use gettid() or pthread_self() in that call (but remember
the CPU argument is a bitmask, not a number), but really you shouldn't have
to do this. Try a kernel that has a non-broken load balancer?
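
To make the bitmask point concrete, here is a rough sketch of the gettid()
route. It assumes gettid() has no libc wrapper and goes through syscall(),
and it passes the resulting kernel thread ID to sched_setaffinity(). As far
as I know, sched_setaffinity() only accepts a kernel thread ID (or 0 for the
caller); a pthread_t from pthread_self() belongs with
pthread_setaffinity_np() instead. The names my_tid() and pin_tid_to_cpu()
are invented for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: kernel thread ID of the caller, via the raw
 * syscall. In the main thread this equals getpid(). */
static pid_t my_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Hypothetical helper: build a mask with only bit 'cpu' set and apply it
 * to the thread 'tid'. Returns 0 on success, -1 with errno set on failure. */
static int pin_tid_to_cpu(pid_t tid, int cpu)
{
    cpu_set_t mask;
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);   /* CPU 1 becomes mask value 0x2, not the number 1 */
    return sched_setaffinity(tid, sizeof(mask), &mask);
}
```

A worker thread would call pin_tid_to_cpu(my_tid(), n) on itself. Passing
the plain number 1 where a mask is expected would select CPU 0 only, which
is the mistake the bitmask remark warns about.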
Brian McGrew wrote:
> And on FC5 I am using pthread_self(), but my problem isn't simply with
> pthread_self(); it's with the scheduling. On FC3 both threads run
> simultaneously, in almost symmetric parallel. On FC5 one thread doesn't pick
> up and start until the previous one is done. On FC3, using getpid() for the
> thread, I could use sched_setaffinity() to force each thread onto its own
> processor, and with FC5 I can't, so I've got one idle processor all the time.
>
This sounds so unlikely I hesitate to mention it, but you are not, by
any chance, running the old pthreads implementation (LinuxThreads) on one
and NPTL on the other, are you?
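
One way to check this from inside the application is sketched below. It
assumes glibc, which reports its threading implementation through
confstr() with _CS_GNU_LIBPTHREAD_VERSION (the same value that
"getconf GNU_LIBPTHREAD_VERSION" prints from a shell). The helper name
thread_lib_version() is invented for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: copy the threading library identification string
 * into buf. On glibc this is typically a string starting with "NPTL" or
 * "linuxthreads". Returns 0 on success, -1 if the value is unavailable
 * or does not fit in len bytes. */
static int thread_lib_version(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    /* confstr() returns the size needed including the trailing NUL,
     * or 0 if the name is not recognized. */
    size_t needed = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    return (needed == 0 || needed > len) ? -1 : 0;
}
```

Running this on both the FC3 and FC5 boxes and comparing the strings would
show whether the two systems are really using the same threading library.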
--
bill davidsen <[email protected]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
On Fri, 19 Jan 2007 19:55:41 +0100, Arjan van de Ven <[email protected]> wrote:
> On Fri, 2007-01-19 at 10:43 -0800, Brian McGrew wrote:
> > What am I missing? What do I need to do in FC5 or the kernel or the
> > threading library to get my threads to run in symmetric parallel again???
>
One thing to try: on Linux, pthread_setconcurrency() never did anything
(it _really_ did on IRIX...). Can you try that? Perhaps FC5 has implemented
some kind of scheduling policy like the one on IRIX (everything stays on the
same CPU until it starts to suck cycles, unless you use setconcurrency).
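
Trying it is cheap. A minimal sketch, assuming the standard POSIX
pthread_setconcurrency() and pthread_getconcurrency() calls; the helper
name request_concurrency() is invented here. On current glibc the value is,
to my knowledge, only stored and handed back, so this is a quick experiment
rather than an expected fix.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: hint the desired concurrency level (for example
 * one per worker thread) and return the level the library reports back,
 * or -1 if the library rejected the request. */
static int request_concurrency(int level)
{
    assert(level >= 0);
    if (pthread_setconcurrency(level) != 0)
        return -1;
    return pthread_getconcurrency();
}
```

Calling request_concurrency(2) once in main(), before the two workers are
created, is enough for the experiment.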
--
J.A. Magallon <jamagallon()ono!com> \ Software is like sex:
\ It's better when it's free
Mandriva Linux release 2007.1 (Cooker) for i586
Linux 2.6.19-jam04 (gcc 4.1.2 20061110 (prerelease) (4.1.2-0.20061110.2mdv2007.1)) #0 SMP PREEMPT