Date: Sat, 18 Sep 2021 11:04:40 +1000
From: Dave Chinner
To: Johannes Weiner
Cc: "Darrick J. Wong", Kent Overstreet, Matthew Wilcox, Linus Torvalds,
    linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, Andrew Morton, Christoph Hellwig,
    David Howells
Subject: Re: Folio discussion recap
Message-ID: <20210918010440.GK1756565@dread.disaster.area>
References: <20210916025854.GE34899@magnolia>
    <20210917052440.GJ1756565@dread.disaster.area>

On Fri, Sep 17, 2021 at 12:31:36PM -0400, Johannes Weiner wrote:
> My question for fs folks is simply this: as long as you can pass a
> folio to kmap and mmap and it knows what to do with it, is there any
> filesystem relevant requirement that the folio map to 1 or more
> literal "struct page", and that folio_page(), folio_nr_pages() etc be
> part of the public API?

In the short term, yes, we need those things in the public API.
In the long term, not so much.

We need something in the public API that tells us the offset and
size of the folio. Lots of page cache code currently does stuff like
calculate the size or iteration counts based on the difference of
page->index values (i.e. number of pages) and iterate page by page.
A direct conversion of such algorithms increments by
folio_nr_pages() instead of 1. So stuff like this is definitely
necessary as public APIs in the initial conversion.

Let's face it, folio_nr_pages() is a huge improvement on directly
exposing THP/compound page interfaces to filesystems and leaving
them to work it out for themselves. So even in the short term, these
API members represent a major step forward in mm API cleanliness.
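
The "increment by folio_nr_pages() instead of 1" conversion described
above can be sketched as a small userspace model. This is not kernel
code: the Folio class, its pos() and size() helpers, build_cache(),
walk_by_page(), walk_by_folio() and the 4096-byte PAGE_SIZE are all
hypothetical names and values chosen for illustration, and the model
assumes the range is fully populated and starts on a folio boundary.

```python
from __future__ import annotations

from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size for this toy model


@dataclass(frozen=True)
class Folio:
    """Toy stand-in for a folio: nr_pages pages starting at page index."""
    index: int     # page cache index of the folio's first page
    nr_pages: int  # number of pages the folio covers

    def pos(self) -> int:
        """Byte offset of the folio within the file."""
        return self.index * PAGE_SIZE

    def size(self) -> int:
        """Size of the folio in bytes."""
        return self.nr_pages * PAGE_SIZE


def build_cache(folios: list[Folio]) -> dict[int, Folio]:
    """Key each folio by the index of its first page (model assumption)."""
    return {folio.index: folio for folio in folios}


def walk_by_page(start: int, end: int) -> list[int]:
    """Old style: visit every page index in [start, end), stepping by 1."""
    visited = []
    index = start
    while index < end:
        visited.append(index)
        index += 1
    return visited


def walk_by_folio(cache: dict[int, Folio], start: int, end: int) -> list[Folio]:
    """Direct conversion: same loop shape, stepping by the folio's page count."""
    visited = []
    index = start
    while index < end:
        folio = cache[index]
        visited.append(folio)
        index += folio.nr_pages
    return visited


if __name__ == "__main__":
    cache = build_cache([Folio(0, 1), Folio(1, 1), Folio(2, 2), Folio(4, 4)])
    for folio in walk_by_folio(cache, 0, 8):
        print(folio.index, folio.nr_pages, folio.pos(), folio.size())
```

The loop body is unchanged between the two walks; only the step
differs, which is why a page count per folio is enough for a first
conversion even though the loop still thinks in page indices.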

As for long term, everything in the page cache API needs to
transition to byte offsets and byte counts instead of units of
PAGE_SIZE and page->index. That's a more complex transition, but
AFAIA that's part of the future work Willy intends to do with
folios and the folio API. Once we get away from accounting and
tracking everything as units of struct page, all the public facing
APIs that use those units can go away.

It's fairly slow to do this, because we have so much code that is
doing stuff like converting file offsets between byte counts and
page counts and vice versa. And it's not necessary for an initial
conversion to folios, either. But once everything in the page cache
indexing API moves to byte ranges, the need to count pages, use page
counts as ranges, iterate by page index, etc all goes away and hence
those APIs can also go away.

As for converting between folios and pages, we'll need those sorts
of APIs for the foreseeable future because low level storage layers
and hardware use pages for their scatter gather arrays and at some
point we've got to expose those pages from behind the folio API.
Even if we replace struct page with some other hardware page
descriptor, we're still going to need such translation APIs at some
point in the stack....

> Or can we keep this translation layer private
> to MM code? And will page_folio() be required for anything beyond the
> transitional period away from pages?

No idea, but as per above I think it's a largely irrelevant concern
for the foreseeable future because pages will be here for a long time
yet.

> Can we move things not used outside of MM into mm/internal.h, mark the
> transitional bits of the public API as such, and move on?

Sure, but that's up to you to do as a patch set on top of Willy's
folio trees if you think it improves the status quo.
Write the patches and present them for review just like everyone
else does, and they can be discussed on their merits in that context
rather than being presented as a reason for blocking current
progress on folios.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com