Date: Wed, 22 Sep 2021 11:08:58 -0400
From: Johannes Weiner
To: Kent Overstreet
Cc: Linus Torvalds, Matthew Wilcox, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    Andrew Morton, "Darrick J. Wong", Christoph Hellwig, David Howells
Subject: Re: Folios for 5.15 request - Was: re: Folio discussion recap -
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 21, 2021 at 05:22:54PM -0400, Kent Overstreet wrote:
> - it's become apparent that there haven't been any real objections to the code
> that was queued up for 5.15.
> There _are_ very real discussions and points of
> contention still to be decided and resolved for the work beyond file backed
> pages, but those discussions were what derailed the more modest, and more
> badly needed, work that affects everyone in filesystem land

Unfortunately, I think this is a result of me wanting to discuss a way
forward rather than a way back.

To clarify: I do very much object to the code as currently queued up,
and not just to a vague future direction.

The patches add and convert a lot of complicated code to provision for
a future we do not agree on. The indirections it adds, and the hybrid
state it leaves the tree in, make it directly more difficult to work
with and understand the MM code base. Stuff that isn't needed for
exposing folios to the filesystems.

As Willy has repeatedly expressed a take-it-or-leave-it attitude in
response to my feedback, I'm not excited about merging this now and
potentially leaving quite a bit of cleanup work to others if the
downstream discussions don't go to his liking.

Here is the roughly annotated pull request:

      mm: Convert get_page_unless_zero() to return bool
      mm: Introduce struct folio
      mm: Add folio_pgdat(), folio_zone() and folio_zonenum()
      mm/vmstat: Add functions to account folio statistics

Used internally and not *really* needed for filesystem folios... There
are a couple of callsites in mm/page-writeback.c so I suppose it's ok.

      mm/debug: Add VM_BUG_ON_FOLIO() and VM_WARN_ON_ONCE_FOLIO()
      mm: Add folio reference count functions
      mm: Add folio_put()
      mm: Add folio_get()
      mm: Add folio_try_get_rcu()
      mm: Add folio flag manipulation functions
      mm/lru: Add folio LRU functions

The LRU code is used by anon and file and not needed for the
filesystem API. And as discussed, there is generally no ambiguity of
tail pages on the LRU list.

      mm: Handle per-folio private data
      mm/filemap: Add folio_index(), folio_file_page() and folio_contains()
      mm/filemap: Add folio_next_index()
      mm/filemap: Add folio_pos() and folio_file_pos()
      mm/util: Add folio_mapping() and folio_file_mapping()
      mm/filemap: Add folio_unlock()
      mm/filemap: Add folio_lock()
      mm/filemap: Add folio_lock_killable()
      mm/filemap: Add __folio_lock_async()
      mm/filemap: Add folio_wait_locked()
      mm/filemap: Add __folio_lock_or_retry()
      mm/swap: Add folio_rotate_reclaimable()

More LRU code, although this one is only used by page-writeback... I
suppose.

      mm/filemap: Add folio_end_writeback()
      mm/writeback: Add folio_wait_writeback()
      mm/writeback: Add folio_wait_stable()
      mm/filemap: Add folio_wait_bit()
      mm/filemap: Add folio_wake_bit()
      mm/filemap: Convert page wait queues to be folios
      mm/filemap: Add folio private_2 functions
      fs/netfs: Add folio fscache functions
      mm: Add folio_mapped()
      mm: Add folio_nid()
      mm/memcg: Remove 'page' parameter to mem_cgroup_charge_statistics()
      mm/memcg: Use the node id in mem_cgroup_update_tree()
      mm/memcg: Remove soft_limit_tree_node()
      mm/memcg: Convert memcg_check_events to take a node ID

These are nice cleanups, unrelated to folios. Ack.

      mm/memcg: Add folio_memcg() and related functions
      mm/memcg: Convert commit_charge() to take a folio
      mm/memcg: Convert mem_cgroup_charge() to take a folio
      mm/memcg: Convert uncharge_page() to uncharge_folio()
      mm/memcg: Convert mem_cgroup_uncharge() to take a folio
      mm/memcg: Convert mem_cgroup_migrate() to take folios
      mm/memcg: Convert mem_cgroup_track_foreign_dirty_slowpath() to folio
      mm/memcg: Add folio_memcg_lock() and folio_memcg_unlock()
      mm/memcg: Convert mem_cgroup_move_account() to use a folio
      mm/memcg: Add folio_lruvec()
      mm/memcg: Add folio_lruvec_lock() and similar functions
      mm/memcg: Add folio_lruvec_relock_irq() and folio_lruvec_relock_irqsave()
      mm/workingset: Convert workingset_activation to take a folio

This is all anon+file stuff, not needed for filesystem folios.
As per the other email, no conceptual entry point for tail pages into
either subsystem, so no ambiguity around the necessity of any
compound_head() calls, directly or indirectly. It's easy to rule out
wholesale, so there is no justification for incrementally annotating
every single use of the page.

NAK.

      mm: Add folio_pfn()
      mm: Add folio_raw_mapping()
      mm: Add flush_dcache_folio()
      mm: Add kmap_local_folio()
      mm: Add arch_make_folio_accessible()
      mm: Add folio_young and folio_idle
      mm/swap: Add folio_activate()
      mm/swap: Add folio_mark_accessed()

This is anon+file aging stuff, not needed.

      mm/rmap: Add folio_mkclean()
      mm/migrate: Add folio_migrate_mapping()
      mm/migrate: Add folio_migrate_flags()
      mm/migrate: Add folio_migrate_copy()

More anon+file conversion, not needed.

      mm/writeback: Rename __add_wb_stat() to wb_stat_mod()
      flex_proportions: Allow N events instead of 1
      mm/writeback: Change __wb_writeout_inc() to __wb_writeout_add()
      mm/writeback: Add __folio_end_writeback()
      mm/writeback: Add folio_start_writeback()
      mm/writeback: Add folio_mark_dirty()
      mm/writeback: Add __folio_mark_dirty()
      mm/writeback: Convert tracing writeback_page_template to folios
      mm/writeback: Add filemap_dirty_folio()
      mm/writeback: Add folio_account_cleaned()
      mm/writeback: Add folio_cancel_dirty()
      mm/writeback: Add folio_clear_dirty_for_io()
      mm/writeback: Add folio_account_redirty()
      mm/writeback: Add folio_redirty_for_writepage()
      mm/filemap: Add i_blocks_per_folio()
      mm/filemap: Add folio_mkwrite_check_truncate()
      mm/filemap: Add readahead_folio()
      mm/workingset: Convert workingset_refault() to take a folio

Anon+file, not needed. NAK.

      mm: Add folio_evictable()
      mm/lru: Convert __pagevec_lru_add_fn to take a folio
      mm/lru: Add folio_add_lru()

LRU code, not needed.

      mm/page_alloc: Add folio allocation functions
      mm/filemap: Add filemap_alloc_folio
      mm/filemap: Add filemap_add_folio()
      mm/filemap: Convert mapping_get_entry to return a folio
      mm/filemap: Add filemap_get_folio
      mm/filemap: Add FGP_STABLE
      mm/writeback: Add folio_write_one

I'm counting about a thousand lines of contentious code that clearly
aren't necessary for exposing folios to the filesystems.

The rest of these are pagecache and writeback. It's still a ton of
(internal) code converted to folios that has conceptually little to no
ambiguity about head and tail pages.

As per the other email I still think it would have been good to have a
high-level discussion about the *legitimate* entry points and data
structures that will continue to deal with tail pages down the
line. To scope the actual problem that is being addressed by this
inverted/whitelist approach - so we don't annotate the entire world
just to box in a handful of page table walkers...

But oh well. Not a hill I care to die on at this point...