Subject: Re: [PATCH 00/11] locking/ww_mutex: Keep sorted wait list to avoid stampedes
From: Nicolai Hähnle
To: Chris Wilson
Cc: linux-kernel@vger.kernel.org
Date: Wed, 30 Nov 2016 14:40:51 +0100
In-Reply-To: <20161130122029.GP23336@nuc-i3427.alporthouse.com>
References: <1480335612-12069-1-git-send-email-nhaehnle@gmail.com>
 <20161130094034.GM23336@nuc-i3427.alporthouse.com>
 <0e930160-2536-97d6-06a3-07cc0b1df651@gmail.com>
 <20161130122029.GP23336@nuc-i3427.alporthouse.com>

On 30.11.2016 13:20, Chris Wilson wrote:
> On Wed, Nov 30, 2016 at 12:52:28PM +0100, Nicolai Hähnle wrote:
>> On 30.11.2016 10:40, Chris Wilson wrote:
>>> On Mon, Nov 28, 2016 at 01:20:01PM +0100, Nicolai Hähnle wrote:
>>>> I've included timings taken from a contention-heavy stress test with some of
>>>> the patches. The stress test performs actual GPU operations which take a
>>>> good chunk of the wall time, but even so, the series still manages to
>>>> improve the wall time quite a bit.
>>>
>>> In looking at your contention scenarios, what was the average/max list
>>> size? Just wondering if it makes sense to use an rbtree + first_waiter
>>> instead of a sorted list from the start.
>>
>> I haven't measured this with the new series; previously, while I was
>> debugging the deadlock on older kernels, I occasionally saw wait
>> lists of up to ~20 tasks. Spit-balling the average over all the
>> deadlock cases, I'd say it was not more than ~5. The average
>> _without_ deadlocks should be lower, if anything.
>
> Right, I wasn't expecting the list to be large, certainly no larger than
> the number of cores, typically. That is on the borderline of where a more
> complex tree starts to pay off.
>
>> I saw that your test cases go quite a bit higher, but even the
>> rather extreme load I was testing with -- which is not quite a load
>> from an actual application, though it is related to one -- has 40
>> threads and so a theoretical maximum of 40.
>
> The stress loads were just values plucked out of nowhere to try and have
> a reasonable stab at hitting the deadlock. Certainly if we were to wrap
> that up in a microbenchmark, we would want to have wider coverage (so that
> the graph against contention is more useful).
>
> Do you have a branch I can pull the patches from (or what did you use as
> the base)?

See git://people.freedesktop.org/~nh/linux (branch "mutex") or
https://cgit.freedesktop.org/~nh/linux/log/?h=mutex.

It's based on tip/core/locking + agd5f's drm-next, the latter only because
I needed it for the test application.

Nicolai
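
As a rough illustration of the "sorted wait list" idea being discussed in
this thread, here is a minimal, self-contained user-space sketch: waiters
are inserted in order of their acquire-context stamp so the oldest context
ends up at the head of the list and wake-ups can stay targeted rather than
stampeding. This is only a sketch under simplified assumptions; the struct
and helper names below are stand-ins, not the kernel's ww_mutex code from
the patch series.

/*
 * Sketch: keep waiters sorted by acquire-context stamp (oldest first).
 * "struct waiter" and "struct list_node" are simplified stand-ins for
 * the kernel's mutex_waiter / list_head; stamp stands in for
 * ww_acquire_ctx::stamp.
 */
#include <stdio.h>

struct list_node {
	struct list_node *prev, *next;
};

struct waiter {
	struct list_node node;	/* first member, so a cast from node works */
	unsigned long stamp;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

/* Insert @n immediately before @pos. */
static void list_insert_before(struct list_node *n, struct list_node *pos)
{
	n->prev = pos->prev;
	n->next = pos;
	pos->prev->next = n;
	pos->prev = n;
}

/*
 * Walk backwards from the tail: a new waiter usually has the youngest
 * (largest) stamp, so the common case stops after one comparison.
 */
static void add_waiter_sorted(struct list_node *wait_list, struct waiter *w)
{
	struct list_node *pos;

	for (pos = wait_list->prev; pos != wait_list; pos = pos->prev) {
		struct waiter *cur = (struct waiter *)pos;

		if (cur->stamp < w->stamp)
			break;	/* older waiter stays ahead of us */
	}
	list_insert_before(&w->node, pos->next);
}

int main(void)
{
	struct list_node wait_list;
	struct waiter a = { .stamp = 3 }, b = { .stamp = 1 }, c = { .stamp = 2 };
	struct list_node *pos;

	list_init(&wait_list);
	add_waiter_sorted(&wait_list, &a);
	add_waiter_sorted(&wait_list, &b);
	add_waiter_sorted(&wait_list, &c);

	/* Prints "1 2 3": the oldest stamp ends up at the head. */
	for (pos = wait_list.next; pos != &wait_list; pos = pos->next)
		printf("%lu ", ((struct waiter *)pos)->stamp);
	printf("\n");
	return 0;
}

With a list this short (the thread cites a maximum of ~20 waiters and an
average around ~5), the linear tail-walk above is cheap, which is the
trade-off behind the sorted-list-vs-rbtree question raised earlier in the
thread.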