Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S938810AbcKWOZl (ORCPT ); Wed, 23 Nov 2016 09:25:41 -0500 Received: from mail-wm0-f48.google.com ([74.125.82.48]:35198 "EHLO mail-wm0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756601AbcKWOZd (ORCPT ); Wed, 23 Nov 2016 09:25:33 -0500 Date: Wed, 23 Nov 2016 15:25:25 +0100 From: Daniel Vetter To: Peter Zijlstra Cc: Nicolai =?iso-8859-1?Q?H=E4hnle?= , Nicolai =?iso-8859-1?Q?H=E4hnle?= , linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Ingo Molnar , stable@vger.kernel.org, Maarten Lankhorst Subject: Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes Message-ID: <20161123142525.ns2pkyp4bo2sa5z2@phenom.ffwll.local> Mail-Followup-To: Peter Zijlstra , Nicolai =?iso-8859-1?Q?H=E4hnle?= , Nicolai =?iso-8859-1?Q?H=E4hnle?= , linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Ingo Molnar , stable@vger.kernel.org, Maarten Lankhorst References: <1479900325-28358-1-git-send-email-nhaehnle@gmail.com> <20161123140336.GU3092@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20161123140336.GU3092@twins.programming.kicks-ass.net> X-Operating-System: Linux phenom 4.8.0-1-amd64 User-Agent: NeoMutt/20161104 (1.7.1) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2087 Lines: 48 On Wed, Nov 23, 2016 at 03:03:36PM +0100, Peter Zijlstra wrote: > On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai H?hnle wrote: > > @@ -473,7 +476,14 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock) > > */ > > mutex_clear_owner(&lock->base); > > #endif > > - __mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath); > > + /* > > + * A previously _not_ waiting task may acquire the lock via the fast > > + * path during our unlock. In that case, already waiting tasks may have > > + * to back off to avoid a deadlock. Wake up all waiters so that they > > + * can check their acquire context stamp against the new owner. > > + */ > > + __mutex_fastpath_unlock(&lock->base.count, > > + __mutex_unlock_slowpath_wakeall); > > } > > So doing a wake-all has obvious issues with thundering herd etc.. Also, > with the new mutex, you'd not be able to do hand-off, which would > introduce starvation cases. > > Ideally we'd iterate the blocked list and pick the waiter with the > earliest stamp, or we'd maintain the list in stamp order instead of > FIFO, for ww_mutex. Not sure we'll win that much, at least I think we still need to wake up everyone with earlier stamp than the one of the task that just released the lock. Otherwise there's deadlocks. So just cuts the wakeups in half, on average. What we could do is do a handoff-hint with the timestamp of earliest task we believe should get the lock. Everyone with a later timestamp that gets woken then knows that they definitely have a deadlock situation and need to back off (thread 2 in the example). thread 1 would get woken, and would be able to take the lock, except when thread 0 successfully raced it and stole the lock. And anyone else racing in with later timestamps would also immediately back off, ensuring fairness. Without thinking it through in detail this is a PI issue, except that we replace boosting with wakeup&back-off. Could we perhaps steal something from rt mutexes to make it fair&efficient? -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch