2000-12-27 14:46:31

by Rogier Wolff

Subject: Semaphores slow???


Hi,

We have a typical semaphore application that has a producer and a
consumer.

Without the semaphores we are limited by the rest of the stuff to
10000 times around the loop per second. That's good.

When we put the "push the semaphore" call in there, the rate drops to
around 8000 per second. I'm not happy about that, but ok. When we add
the "wait for bufferspace" semaphore wait in there, the rate drops to
4000 per second. This is way too low.

Does anybody have an idea why linux semaphores are so slow?

init:
    sem_init (&write_sem, 1, 0);
    sem_init (&write_buffer_sem, 1, 0);

producer:
    while (1) {
        sem_wait (&write_buffer_sem);
        ...

        sem_post (&write_sem);
    }

consumer:
    for (i=0;i<nbufs;i++) {
        Create_buffer (i);
        sem_post (&write_buffer_sem);
    }

    while (1) {
        sem_wait (&write_sem);
        ...

        sem_post (&write_buffer_sem);
    }


(pshared == 0 doesn't work: returns error. The manpage claims that
this would happen for pshared != 0... )


Roger.


--
** [email protected] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots.
* There are also old, bald pilots.


2000-12-27 15:57:52

by J.A. Magallon

Subject: Re: Semaphores slow???


On 2000.12.27 Rogier Wolff wrote:
> We have a typical semaphore application that has a producer and a
> consumer.
>
> Without the semaphores we are limited by the rest of the stuff to
> 10000 times around the loop per second. That's good.
>
> When we put the "push the semaphore" call in there, the rate drops to
> around 8000 per second. I'm not happy about that, but ok. When we add
> the "wait for bufferspace" semaphore wait in there, the rate drops to
> 4000 per second. This is way too low.
>

Look (s==write_sem, bs==write_buffer_sem):

   cons            prod
   ====================
                   wait(bs)
   post(bs)
   wait(s)         <wake>
                   <work>
                   post(s)
   <wake>          wait(bs)
   <work>
   post(bs)
   wait(s)         <wake>
                   <work>
                   post(s)
   <wake>          wait(bs)
   <work>
   post(bs)
   wait(s)         <wake>
   .................

So there is no way that <work> can be done at the same time in the
producer and the consumer. If you measure the loops per second of the
producer (for example), in 'free run' you get 10k; in a synchronized run
with the consumer you get just half of that, because prod and cons
are really running sequentially, one after the other.

You need to tighten the mutexed zone for the thing to work in parallel
in an efficient way.
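
The halving argument above can be put into numbers with a small sketch. It assumes the consumer's <work> takes about as long as the producer's (100 microseconds, from the 10000 loops/sec figure); the thread does not state the consumer's cost, and the function names are made up for illustration. Semaphore and wake-up overhead are ignored, so the results are upper bounds only.

```c
#include <assert.h>

/* Items per second when producer and consumer strictly alternate:
 * every item costs one producer <work> plus one consumer <work>. */
double lockstep_rate(double t_prod, double t_cons)
{
    assert(t_prod > 0.0 && t_cons > 0.0);
    return 1.0 / (t_prod + t_cons);
}

/* Items per second when both <work> phases overlap completely:
 * the slower side alone sets the pace. */
double overlapped_rate(double t_prod, double t_cons)
{
    assert(t_prod > 0.0 && t_cons > 0.0);
    double slowest = t_prod > t_cons ? t_prod : t_cons;
    return 1.0 / slowest;
}
```

With both times set to 100e-6 seconds, lockstep_rate() gives about 5000 items/sec and overlapped_rate() about 10000 items/sec. The reported 4000 per second is below the lockstep bound, which is what one would expect once the ignored overheads are added back.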

--
J.A. Magallon $> cd pub
mailto:[email protected] $> more beer

Linux werewolf 2.2.19-pre3-aa3 #3 SMP Wed Dec 27 10:25:32 CET 2000 i686

2000-12-27 17:19:28

by J.A. Magallon

Subject: Re: Semaphores slow???


On 2000.12.27 Rogier Wolff wrote:
> J . A . Magallon wrote:
>
> You missed that we push the bs 1000 times before we allow the producer
> to start.
>

Let's rename:
write_buffer_sem == holes
write_sem == items

So:
producer:
    while (1) {
        // Wait for a hole
        sem_wait (&holes);
        ...
        // Signal that we posted an item
        sem_post (&items);
    }
consumer:
    for (i=0;i<1000;i++)
        sem_post (&holes);
    /* OK, you have 1000 holes to fill */
    while (1) {
        // Wait for an item
        sem_wait (&items);
        ...
        // Say we left a hole
        sem_post (&holes);
    }

So you have a finite FIFO of size N, empty at start. There will be
a short transient state, and then:
- if the producer is faster, the queue fills; the consumer runs
  'free-run'-like, since there is always something to eat, but the
  producer has to wait for it to free a hole.
- if the consumer is faster, the queue empties and it runs as if the
  queue size were 1. There is always a hole for a new item, but the
  consumer has to wait a full producer loop to get an item.
And you end up in the case of the previous post.

You don't have to keep the queue blocked while you work on items.
Try something like:
producer:
    while (1) {
        ... (prepare item)
        (and the consumer can read items at the same time)
        // Wait for a hole
        sem_wait (&holes);
        put item
        // Signal that we posted an item
        sem_post (&items);
    }
consumer:
    for (i=0;i<1000;i++)
        sem_post (&holes);
    /* OK, you have 1000 holes to fill */
    while (1) {
        // Wait for an item
        sem_wait (&items);
        get item
        // Say we left a hole
        sem_post (&holes);
        ... (use item)
        (and the producer can insert items at the same time)
    }
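
For reference, the outline above could be fleshed out roughly as follows. This is only a sketch under assumptions not in the thread: POSIX threads in one process (so pshared is 0), a single producer and a single consumer, plain ints as items, and a ring of NBUFS slots. The names ring, NBUFS, wait_sem and run_pipeline are hypothetical. The holes semaphore is initialised to NBUFS directly, which has the same effect as posting it NBUFS times.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define NBUFS 8                /* hypothetical ring size; the thread uses 1000 */

static int ring[NBUFS];
static sem_t holes;            /* free slots   (write_buffer_sem) */
static sem_t items;            /* filled slots (write_sem) */
static int n_items;            /* number of items to pass through */

/* sem_wait() can return early with EINTR; retry in that case. */
static void wait_sem(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR)
        ;
}

static void *producer(void *arg)
{
    unsigned head = 0;
    (void)arg;
    for (int i = 1; i <= n_items; i++) {
        int item = i;          /* "prepare item": done before waiting */
        wait_sem(&holes);      /* wait for a hole */
        ring[head] = item;     /* put item */
        head = (head + 1) % NBUFS;
        sem_post(&items);      /* signal that we posted an item */
    }
    return NULL;
}

static void *consumer(void *arg)
{
    long *sum = arg;
    unsigned tail = 0;
    for (int i = 0; i < n_items; i++) {
        wait_sem(&items);      /* wait for an item */
        int item = ring[tail]; /* get item */
        tail = (tail + 1) % NBUFS;
        sem_post(&holes);      /* say we left a hole */
        *sum += item;          /* "use item": done after releasing the slot */
    }
    return NULL;
}

/* Pushes the values 1..count through the ring and returns their sum. */
long run_pipeline(int count)
{
    pthread_t prod, cons;
    long sum = 0;
    int rc;

    n_items = count;
    rc = sem_init(&holes, 0, NBUFS);  assert(rc == 0);
    rc = sem_init(&items, 0, 0);      assert(rc == 0);
    rc = pthread_create(&cons, NULL, consumer, &sum);  assert(rc == 0);
    rc = pthread_create(&prod, NULL, producer, NULL);  assert(rc == 0);
    (void)rc;
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    sem_destroy(&holes);
    sem_destroy(&items);
    return sum;
}
```

Calling run_pipeline(1000) from main() and building with -pthread should return 500500. Because there is exactly one producer and one consumer, head and tail are each touched by one thread only and need no extra lock; with several producers or consumers they would.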



2000-12-28 12:28:13

by Davide Libenzi

Subject: Re: Semaphores slow???

On Wed, 27 Dec 2000, Rogier Wolff wrote:
> Hi,
>
> We have a typical semaphore application that has a producer and a
> consumer.
>
> Without the semaphores we are limited by the rest of the stuff to
> 10000 times around the loop per second. That's good.
>
> When we put the "push the semaphore" call in there, the rate drops to
> around 8000 per second. I'm not happy about that, but ok. When we add
> the "wait for bufferspace" semaphore wait in there, the rate drops to
> 4000 per second. This is way too low.

This sounds weird to me.
I think that Linux semaphores (not SysV IPC) are actually quite fast.
They're based on fast spinlocks and the Linux scheduling code (no need
for predefined scheduling points).
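
One way to check this claim on a given system is to time the uncontended path in isolation. The sketch below is an assumption-laden micro-benchmark, not something from the thread: the function name is made up, it runs in a single thread, and it measures only a sem_post() immediately followed by a sem_wait() that never blocks. It says nothing about the cost of waking a sleeping thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <semaphore.h>
#include <time.h>

/* Average nanoseconds for one sem_post() + sem_wait() pair, no contention. */
double ns_per_post_wait(long iterations)
{
    sem_t sem;
    struct timespec t0, t1;
    int rc;

    assert(iterations > 0);
    rc = sem_init(&sem, 0, 0);
    assert(rc == 0);
    (void)rc;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iterations; i++) {
        sem_post(&sem);        /* count 0 -> 1 */
        sem_wait(&sem);        /* count 1 -> 0, never blocks */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    sem_destroy(&sem);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    return ns / iterations;
}
```

At 10000 loops/sec one iteration takes 100 microseconds, so comparing that budget with the value returned by, say, ns_per_post_wait(1000000) indicates whether the semaphore calls themselves or the blocking and wake-ups are the likelier cost.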


>
> Does anybody have an idea why linux semaphores are so slow?
>
> init:
> sem_init (&write_sem, 1, 0);
> sem_init (&write_buffer_sem, 1, 0);
>
>
>
> producer:
>
>
> while (1) {
> sem_wait (&write_buffer_sem);
> ...
>
> sem_post (&write_sem);
> }
>
>
> consumer:
> for (i=0;i<nbufs;i++) {
> Create_buffer (i);
> sem_post (&write_buffer_sem);
> }
>
> while (1) {
> sem_wait (&write_sem);
> ...
>
> sem_post (&write_buffer_sem);
> }
>

Which kind of synchronization did you use to get 10000 loops/sec?



> (pshared == 0 doesn't work: returns error. The manpage claims that
> this would happen for pshared != 0... )

Which glibc version are you using?
This is the code for sem_init() (2.1.3):

int __new_sem_init(sem_t *sem, int pshared, unsigned int value)
{
    if (value > SEM_VALUE_MAX) {
        errno = EINVAL;
        return -1;
    }
    if (pshared) {
        errno = ENOSYS;
        return -1;
    }
    __pthread_init_lock((struct _pthread_fastlock *) &sem->__sem_lock);
    sem->__sem_value = value;
    sem->__sem_waiting = NULL;
    return 0;
}


Are you sure you're not using the pthread emulation from the GNU Pth library?
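
To see which of the two error paths in the code above a particular installation takes, a small probe along these lines may help. It is a sketch only: the helper name probe_sem_init is hypothetical, and it relies on nothing beyond the documented sem_init() behaviour of returning -1 and setting errno on failure.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdio.h>
#include <string.h>

/* Returns 0 if sem_init() succeeds, otherwise the errno value it reported. */
int probe_sem_init(int pshared, unsigned int value)
{
    sem_t sem;

    if (sem_init(&sem, pshared, value) == -1) {
        int err = errno;       /* save before calling anything else */
        fprintf(stderr, "sem_init(pshared=%d, value=%u) failed: %s\n",
                pshared, value, strerror(err));
        return err;
    }
    sem_destroy(&sem);
    return 0;
}
```

With the 2.1.3 code shown above, probe_sem_init(0, 0) would be expected to return 0 and probe_sem_init(1, 0) to return ENOSYS. A result the other way round would point at a different thread library being linked in.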




- Davide