Date: Mon, 12 Mar 2001 01:10:30 +0100 (CET)
From: Davide Libenzi
To: Anton Blanchard
Cc: linux-kernel@vger.kernel.org, Andi Kleen
Subject: Re: sys_sched_yield fast path
In-Reply-To: <20010312005448.A5439@linuxcare.com>

On 11-Mar-2001 Anton Blanchard wrote:
>
>> This is the LinuxThreads spinlock acquire:
>>
>> static void __pthread_acquire(int * spinlock)
>> {
>>     int cnt = 0;
>>     struct timespec tm;
>>
>>     while (testandset(spinlock)) {
>>         if (cnt < MAX_SPIN_COUNT) {
>>             sched_yield();
>>             cnt++;
>>         } else {
>>             tm.tv_sec = 0;
>>             tm.tv_nsec = SPIN_SLEEP_DURATION;
>>             nanosleep(&tm, NULL);
>>             cnt = 0;
>>         }
>>     }
>> }
>>
>> Yes, it calls sched_yield(), but this is not a standard wait for a mutex:
>> it is for spinlocks that are held for a very short time. Real waits are
>> implemented using signals. Moreover, with the new implementation of
>> sys_sched_yield() the task releases all of its time quantum, so even when
>> a task repeatedly calls sched_yield() the call rate is not that high as
>> long as there is at least one other process to run. And if there isn't a
>> task with goodness() > 0, nobody cares about sched_yield() performance.
>
> The problem I found with sched_yield is that things break down under high
> levels of contention. If you have 3 processes and one holds a lock, the
> other two can ping-pong doing sched_yield() until their priority drops
> below that of the process with the lock. E.g. in a run I just did, where
> 2 holds the lock:
>
> 1
> 0
> 1
> 0
> 1
> 0
> 1
> 0
> 1
> 0
> 1
> 0
> 1
> 0
> 1
> 0
> 1
> 0
> 2
>
> Perhaps we need something like sched_yield that takes off some of
> tsk->counter so the task with the spinlock will run earlier.

2.4.x has changed the scheduler behaviour so that the task that calls
sched_yield() is not rescheduled by the incoming schedule(). A flag is set
( under certain conditions in SMP ) and the goodness() calculation assigns
the lowest value to the yielding task ( the flag is cleared in
schedule_tail() ). This gives the task owning the lock the opportunity to
complete the locked code. But yes, if the lock holder is rescheduled for
some reason ( end of timeslice or I/O ) the yielding task will run again.
But this is a software design problem, not a sched_yield() one: if the path
between lock and unlock can be long, sched_yield() is not the best way to
wait. Wait queues, or their user space equivalents, are a better choice.

- Davide
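
To make the 2.4.x behaviour described above concrete, here is a small
self-contained user-space model of the mechanism: the yielding task is
flagged, goodness() then ranks it below every other runnable task, and the
flag is cleared after the next scheduling decision. This is only a sketch
loosely inspired by kernel/sched.c; the struct fields, the flag value and
the numbers are illustrative, not the kernel's own code:

    #include <stdio.h>

    #define SCHED_YIELD 0x10                /* illustrative policy bit */

    struct task {
            int pid;
            int counter;                    /* remaining timeslice */
            int policy;
    };

    /* A task that just called sched_yield() gets the lowest weight. */
    static int goodness(const struct task *p)
    {
            if (p->policy & SCHED_YIELD)
                    return -1;
            return p->counter;
    }

    /* Pick the runnable task with the highest goodness(). */
    static struct task *pick_next(struct task *rq, int n)
    {
            struct task *next = &rq[0];
            int i;

            for (i = 1; i < n; i++)
                    if (goodness(&rq[i]) > goodness(next))
                            next = &rq[i];
            return next;
    }

    /* schedule_tail(): the yield lasts for exactly one decision. */
    static void schedule_tail(struct task *rq, int n)
    {
            int i;

            for (i = 0; i < n; i++)
                    rq[i].policy &= ~SCHED_YIELD;
    }

    int main(void)
    {
            struct task rq[3] = {
                    { 0, 6, 0 },
                    { 1, 6, 0 },
                    { 2, 3, 0 },            /* lock holder, lower counter */
            };

            /* Both spinners yield: pid 2 wins despite its lower counter. */
            rq[0].policy |= SCHED_YIELD;
            rq[1].policy |= SCHED_YIELD;
            printf("next: pid %d\n", pick_next(rq, 3)->pid);    /* -> 2 */
            schedule_tail(rq, 3);
            return 0;
    }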
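
Anton's suggestion ( taking something off tsk->counter as well ) could look
roughly like this in the same toy model; the halving factor is an arbitrary,
purely hypothetical choice:

    /* Hypothetical variant of Anton's idea: yielding also gives up half
     * of the remaining timeslice, so a spinning task sinks below the
     * lock holder after a few yields instead of ping-ponging at full
     * priority.  The 1/2 factor is an arbitrary illustration. */
    static void sched_yield_decay(struct task *p)
    {
            p->counter >>= 1;               /* lasting priority drop */
            p->policy |= SCHED_YIELD;       /* plus the one-decision skip */
    }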
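
As for the user space equivalents mentioned at the end: a plain pthread
mutex is the kind of blocking wait that avoids the sched_yield() ping-pong
entirely, since contended waiters sleep until the holder unlocks instead of
spinning. A minimal sketch ( link with -lpthread ):

    #include <pthread.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static long shared;

    static void *worker(void *arg)
    {
            pthread_mutex_lock(&lock);      /* contended waiters sleep */
            shared++;
            pthread_mutex_unlock(&lock);
            return arg;
    }

    int main(void)
    {
            pthread_t t1, t2;

            pthread_create(&t1, NULL, worker, NULL);
            pthread_create(&t2, NULL, worker, NULL);
            pthread_join(t1, NULL);
            pthread_join(t2, NULL);
            return 0;
    }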