From: Linus Torvalds
Date: Sun, 27 Aug 2017 22:17:55 -0700
Subject: Re: [PATCH 2/2 v2] sched/wait: Introduce lock breaker in wake_up_page_bit
To: Nicholas Piggin
Cc: Tim Chen, Mel Gorman, Peter Zijlstra, Ingo Molnar, Andi Kleen, Kan Liang, Andrew Morton, Johannes Weiner, Jan Kara, Christopher Lameter, "Eric W. Biederman", Davidlohr Bueso, linux-mm, Linux Kernel Mailing List
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Aug 27, 2017 at 6:29 PM, Nicholas Piggin wrote:
>
> BTW. since you are looking at this stuff, one other small problem I remember
> with exclusive waiters is that losing to a concurrent locker puts them to
> the back of the queue. I think that could be fixed with some small change to
> the wait loops (first add to tail, then retries add to head). Thoughts?

No, not that way. First off, it's oddly complicated, but more
importantly, the real unfairness you lose to is not other things on
the wait queue, but other lockers that aren't on the wait queue at
all, and instead just come in and do a "test-and-set" without ever
even going through the slow path.

So instead of playing queuing games, you'd need to just change the
unlock sequence.
Right now we basically do:

 - clear the lock bit and atomically test if contended (and we play
   games with bit numbering to do that atomic test efficiently)

 - if contended, wake things up

and you'd change the logic to be:

 - if contended, don't clear the lock bit at all, just transfer the
   lock ownership directly to the waiters by walking the wait list

 - clear the lock bit only once there are no more wait entries (either
   because there were no waiters at all, or because all the entries
   were just waiting for the lock to be released)

which is certainly doable with a couple of small extensions to the
page wait key data structure.

But most of my clever schemes over the last few days were abject
failures, and honestly, it's late in the rc. In fact, this late in
the game I probably wouldn't even have committed the small cleanups I
did if it wasn't for the fact that thinking about the whole
WQ_FLAG_EXCLUSIVE bit made me find the bug.

So the cleanups were actually what got me to look at the problem in
the first place, and then I went "I'm going to commit the cleanup,
and then I can think about the bug I just found".

I'm just happy that the fix seems to be trivial. I was afraid I'd
have to do something nastier (like have the EINTR case send another
explicit wakeup to make up for the lost one, or some ugly hack like
that).

It was only when I started looking at the history of that code, and I
saw the old bit_lock code, and went "Hmm. That has the _same_ bug -
oh wait, no it doesn't!" that I realized there was that simple fix.

You weren't cc'd on the earlier part of the discussion; you only got
added when I realized what the history and the simple fix were.

Linus