From: Linus Torvalds
Date: Fri, 25 Aug 2017 19:54:27 -0700
Subject: Re: [PATCH 2/2 v2] sched/wait: Introduce lock breaker in wake_up_page_bit
To: Tim Chen
Cc: Mel Gorman, Peter Zijlstra, Ingo Molnar, Andi Kleen, Kan Liang,
 Andrew Morton, Johannes Weiner, Jan Kara, Christopher Lameter,
 "Eric W. Biederman", Davidlohr Bueso, linux-mm,
 Linux Kernel Mailing List
References: <83f675ad385d67760da4b99cd95ee912ca7c0b44.1503677178.git.tim.c.chen@linux.intel.com>

On Fri, Aug 25, 2017 at 5:31 PM, Linus Torvalds wrote:
>
> It made it way more fragile and complicated, having to rewrite things
> so carefully. A simple slab cache would likely be a lot cleaner and
> simpler.

It also turns out that despite all the interfaces, we only ever
really wait on two different bits: PG_locked and PG_writeback.
Nothing else.

Even the add_page_wait_queue() thing, which looks oh-so-generic,
really only waits on PG_locked.

And the PG_writeback case never really cares about the "locked" case,
so this incredibly generic interface, which allows you to wait on any
bit you want and has the whole exclusive-wait support for getting
exclusive access to the bit, really only has three cases:

 - wait for locked exclusive (wake up first waiter when unlocked)

 - wait for locked (wake up all waiters when unlocked)

 - wait for writeback (wake up all waiters when no longer under
   writeback)

and those last two could probably even share the same queue.

But even without sharing the same queue, we could just do one
per-page allocation for the three queues - and probably for that
stupid add_page_wait_queue() waitqueue too. So no "per-page and
per-bit" thing, just a per-page thing.

I'll try writing that up. Simplify, simplify, simplify.

               Linus
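
PS. Something like the below is the shape of what I have in mind.
Totally untested strawman, the names are all made up, and it waves
away the hard part (the lifetime and locking of the allocation
itself):

#include <linux/slab.h>
#include <linux/wait.h>

/*
 * Strawman sketch: one small allocation per contended page, instead
 * of one hashed queue per page *and* per bit.
 */
struct page_wait {
	wait_queue_head_t lock_q;	/* PG_locked: exclusive and shared waiters */
	wait_queue_head_t writeback_q;	/* PG_writeback: shared waiters only */
};

static struct page_wait *alloc_page_wait(gfp_t gfp)
{
	struct page_wait *pw = kmalloc(sizeof(*pw), gfp);

	if (pw) {
		init_waitqueue_head(&pw->lock_q);
		init_waitqueue_head(&pw->writeback_q);
	}
	return pw;
}

/*
 * unlock_page() side: wake_up() wakes every shared waiter but only
 * the first exclusive one, which is exactly the lock semantics we
 * want for the first two cases above.
 */
static void wake_up_page_unlocked(struct page_wait *pw)
{
	wake_up(&pw->lock_q);
}

/* end_page_writeback() side: the third case, everybody runs */
static void wake_up_page_writeback_done(struct page_wait *pw)
{
	wake_up_all(&pw->writeback_q);
}

The point being that unlock_page() and end_page_writeback() each poke
exactly one queue, with no hashing on the bit number anywhere.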