Subject: Re: [PATCH -tip v2 1/2] locking/rtmutex: Support spin on owner
From: Davidlohr Bueso
To: Thomas Gleixner
Cc: Ingo Molnar, Peter Zijlstra, Darren Hart, Steven Rostedt,
    Mike Galbraith, "Paul E. McKenney", Sebastian Andrzej Siewior,
    linux-kernel@vger.kernel.org
Date: Wed, 01 Jul 2015 17:16:38 -0700

On Thu, 2015-07-02 at 00:27 +0200, Thomas Gleixner wrote:
> On Wed, 1 Jul 2015, Davidlohr Bueso wrote:
>
> > Similar to what we have in other locks, particularly regular mutexes, the
> > idea is that as long as the owner is running, there is a fair chance it'll
> > release the lock soon, and thus a task trying to acquire the rtmutex will
> > be better off spinning instead of blocking immediately after the fastpath.
> > Conditions to stop spinning and enter the slowpath are simple:
> >
> > (1) Upon need_resched()
> > (2) Current lock owner blocks
> >
> > Because rtmutexes track the lock owner atomically, we can extend the fastpath
> > to continue polling on the lock owner via cmpxchg(lock->owner, NULL, current).
> >
> > However, this is a conservative approach, such that if there are any waiters
> > in-line, we stop spinning and immediately take the traditional slowpath. This
> > allows priority boosting to take precedence over spinning, as otherwise we
> > could starve a higher-priority queued-up task (ie: top waiter) if spinners
> > constantly steal the lock.
>
> I'm a bit wary about the whole approach. In the RT tree we spin AFTER
> we've enqueued the waiter and run priority boosting. While I can see
> the charm of your approach, i.e. avoiding the prio boost dance for the
> simple case, this can introduce larger latencies.
>
> T1 (prio = 0)       T2 (prio = 50)
> lock(RTM);
>                     lock(RTM);
>                     spin()
> --> preemption
> T3 (prio = 10)      leave spin, because owner is not on cpu
>
>                     enqueue();
>                     boost();
>                     schedule();
> --> preemption
> T1 (prio = 50)
>
> So we trade two extra context switches in the worst case for an
> enhancement of performance in the normal case. I cannot quantify the
> impact of this, but we really need to evaluate that properly before
> going there.

This is a very good point. My first thought is that for a general
purpose OS, the extra latency is probably ok -- if you _really_ rely
on rt characteristics enough to care about this, you should be using
the rt-patchset to begin with, methinks. Aside from the non-blocking
performance benefits of spinning, avoiding the wait_lock (and the
pi_lock in the case of doing the spinning AFTER the boosting) can ease
a lot of the lock contention.
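To make that concrete, the shape of the proposed spin is roughly the
below. This is only a sketch of the idea, not the patch itself:
rt_mutex_owner(), rt_mutex_has_waiters() and the rt_mutex_cmpxchg()
fastpath helper exist today, while the owner->on_cpu test stands in
for the "owner is running" check (SMP-only, and the real thing would
need RCU protection around the owner dereference):

static bool rt_mutex_optimistic_spin(struct rt_mutex *lock)
{
	bool taken = false;

	preempt_disable();
	for (;;) {
		struct task_struct *owner;

		/* (1) Give up the cpu when we are due for a reschedule. */
		if (need_resched())
			break;

		/*
		 * Conservative: any queued waiter sends us straight to
		 * the slowpath, so priority boosting takes precedence.
		 */
		if (rt_mutex_has_waiters(lock))
			break;

		owner = rt_mutex_owner(lock);
		if (!owner) {
			/* Appears free: try to steal it via the fastpath. */
			if (rt_mutex_cmpxchg(lock, NULL, current)) {
				taken = true;
				break;
			}
			continue;	/* lost the race, re-evaluate */
		}

		/* (2) Owner blocked or preempted: spinning is pointless. */
		if (!owner->on_cpu)
			break;

		cpu_relax();
	}
	preempt_enable();

	return taken;
}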
> Aside of that, if the lock is really contended, then you force all
> spinners off the cpu if one of the spinners starts blocking, simply
> because you have no idea which one is the top prio spinner.
>
> T1 (prio = 0)       T2 (prio = 50)      T3 (prio = 10)
> lock(RTM);
>                     lock(RTM);          lock(RTM);
>                     spin()              spin();
>                                         --> preemption
>                                         enqueue()
>                                         boost();
>                                         schedule();
>                     sees waiter bit
>                     enqueue();
>                     boost();
>                     schedule();
>
> T2 could happily keep spinning despite T3 going to sleep. I'm not sure
> if that's what we want to achieve.

Yeah, this is one of the reasons why I was tempted to check the
top-waiter's prio against current's prio to possibly keep spinning.
Now, doing so without holding the wait_lock is obviously racy, but
safe afaict. If the top waiter changes after the spinner checks it and
we get it wrong, at worst we falsely send that thread to sleep,
otherwise we bogus-spin once more -- neither of which is the end of
the world. Ie:

[top waiter prio 10]

T1 (prio 0)          T2                                              T3 (prio 0)
lock(RTM)
                     if (rt_mutex_has_waiters(lock) &&               lock(RTM)
                         rt_mutex_top_waiter()->prio > current->prio) {
                             spin ...
[release lock]
[wakeup top waiter]                                                  [adds itself to the tree]
                     }

But I could be overlooking something, which is why I chose to exclude
it in this patch.
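Fwiw, sketched against the spin loop above, the check would slot in
roughly as follows -- again just a sketch, not tested code.
rt_mutex_top_waiter() is the existing helper, and reading it without
wait_lock is exactly the racy part being discussed:

	/*
	 * Hypothetical replacement for the conservative "any waiter ->
	 * slowpath" test: keep spinning only while we outrank the top
	 * waiter (lower ->prio value means higher priority), so stealing
	 * the lock cannot starve it. Done locklessly, a stale top waiter
	 * costs at worst one false sleep or one more useless spin.
	 */
	if (rt_mutex_has_waiters(lock) &&
	    rt_mutex_top_waiter(lock)->prio <= current->prio)
		break;	/* top waiter outranks us, take the slowpath */

Thanks,
Davidlohr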