Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751374AbdH1UBD (ORCPT ); Mon, 28 Aug 2017 16:01:03 -0400 Received: from mga09.intel.com ([134.134.136.24]:36299 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751190AbdH1UBB (ORCPT ); Mon, 28 Aug 2017 16:01:01 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.41,442,1498546800"; d="scan'208";a="1008498549" Subject: Re: [PATCH 2/2 v2] sched/wait: Introduce lock breaker in wake_up_page_bit To: Linus Torvalds , "Liang, Kan" Cc: Mel Gorman , Peter Zijlstra , Ingo Molnar , Andi Kleen , Andrew Morton , Johannes Weiner , Jan Kara , Christopher Lameter , "Eric W . Biederman" , Davidlohr Bueso , linux-mm , Linux Kernel Mailing List References: <83f675ad385d67760da4b99cd95ee912ca7c0b44.1503677178.git.tim.c.chen@linux.intel.com> <37D7C6CF3E00A74B8858931C1DB2F077537A07E9@SHSMSX103.ccr.corp.intel.com> From: Tim Chen Message-ID: <42de956b-c504-5534-cc4f-5af1df21d49b@linux.intel.com> Date: Mon, 28 Aug 2017 13:01:00 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2889 Lines: 76 On 08/28/2017 09:48 AM, Linus Torvalds wrote: > On Mon, Aug 28, 2017 at 7:51 AM, Liang, Kan wrote: >> >> I tried this patch and https://lkml.org/lkml/2017/8/27/222 together. >> But they don't fix the issue. I can still get the similar call stack. > > So the main issue was that I *really* hated Tim's patch #2, and the > patch to clean up the page wait queue should now make his patch series > much more palatable. > > Attached is an ALMOST COMPLETELY UNTESTED forward-port of those two > patches, now without that nasty WQ_FLAG_ARRIVALS logic, because we now > always put the new entries at the end of the waitqueue. > > The attached patches just apply directly on top of plain 4.13-rc7. > > That makes patch #2 much more palatable, since it now doesn't need to > play games and worry about new arrivals. > > But note the lack of testing. I've actually booted this and am running > these two patches right now, but honestly, you should consider them > "untested" simply because I can't trigger the page waiters contention > case to begin with. > > But it's really just Tim's patches, modified for the page waitqueue > cleanup which makes patch #2 become much simpler, and now it's > palatable: it's just using the same bookmark thing that the normal > wakeup uses, no extra hacks. > > So Tim should look these over, and they should definitely be tested on > that load-from-hell that you guys have, but if this set works, at > least I'm ok with it now. > > Tim - did I miss anything? I added a "cpu_relax()" in there between > the release lock and irq and re-take it, I'm not convinced it makes > any difference, but I wanted to mark that "take a breather" thing. > > Oh, there's one more case I only realized after the patches: the > stupid add_page_wait_queue() code still adds to the head of the list. > So technically you need this too: > > diff --git a/mm/filemap.c b/mm/filemap.c > index 74123a298f53..598c3be57509 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -1061,7 +1061,7 @@ void add_page_wait_queue(struct page *page, > wait_queue_entry_t *waiter) > unsigned long flags; > > spin_lock_irqsave(&q->lock, flags); > - __add_wait_queue(q, waiter); > + __add_wait_queue_entry_tail(q, waiter); I've also found this part of the code odd that add to head, but wasn't sure about the history behind it to have changed it. Adding to tail makes things much cleaner. I'm glad that those ugly hacks that added flags and counter to track new arrivals can be discarded. The modified patchset looks fine to me. So pending Kan's test on the new code I think we are good. Thanks. Tim > SetPageWaiters(page); > spin_unlock_irqrestore(&q->lock, flags); > } > > but that only matters if you actually use the cachefiles thing, which > I hope/assume you don't. > > Linus >