Date: Tue, 24 Aug 2021 20:44:01 +0100
From: Matthew Wilcox
To: Johannes Weiner
Cc: Linus Torvalds, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, Andrew Morton
Subject: Re: [GIT PULL] Memory folios for v5.15

On Tue, Aug 24, 2021 at 02:32:56PM -0400, Johannes Weiner wrote:
> The folio doc says "It is at least as large as %PAGE_SIZE";
> folio_order() says "A folio is composed of 2^order pages";
> page_folio(), folio_pfn() and folio_nr_pages() all encode an N:1
> relationship. And yes, the name implies it too.
>
> This is in direct conflict with what I'm talking about, where base
> page granularity could become coarser than file cache granularity.

That doesn't make any sense.  A page is the fundamental unit of the mm.
Why would we want to increase the granularity of page allocation and
not increase the granularity of the file cache?

> Are we going to bump struct page to 2M soon? I don't know. Here is
> what I do know about 4k pages, though:
>
> - It's a lot of transactional overhead to manage tens of gigs of
>   memory in 4k pages. We're reclaiming, paging and swapping more than
>   ever before in our DCs, because flash provides in abundance the
>   low-latency IOPS required for that, and parking cold/warm workload
>   memory on cheap flash saves expensive RAM. But we're continuously
>   scanning thousands of pages per second to do this. There was also
>   the RWF_UNCACHED thread around reclaim CPU overhead at the higher
>   end of buffered IO rates. And there is a pending proposal from
>   Google to replace rmap because it's too CPU-intensive when paging
>   into compressed memory pools.

This seems like an argument for folios, not against them.  If user
memory (both anon and file) is being allocated in larger chunks, there
are fewer pages to scan, less book-keeping to do, and all you're paying
for that is I/O bandwidth.

> - It's a lot of internal fragmentation. Compaction is becoming the
>   default method for allocating the majority of memory in our
>   servers. This is a latency concern during page faults, and a
>   predictability concern when we defer it to khugepaged collapsing.

Again, the more memory we allocate in higher-order chunks, the better
this situation becomes.

> - struct page is statically eating gigs of expensive memory on every
>   single machine, when only some of our workloads would require this
>   level of granularity for some of their memory. And that's *after*
>   we're fighting over every bit in that structure.

That, folios do not help with.  I have post-folio ideas about how to
address it, but I can't realistically start working on them until
folios are upstream.
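
For scale, the static cost being described works out roughly as below.
This is only a back-of-the-envelope sketch, assuming the common 64-byte
struct page and 4KiB base pages on x86-64; the real struct page size
depends on the config, so treat the numbers as illustrative:

/* Rough arithmetic for the static struct page overhead. */
#include <stdio.h>

int main(void)
{
	unsigned long long ram    = 1ULL << 40;	/* 1 TiB of RAM */
	unsigned long long pagesz = 4096;	/* 4 KiB base pages */
	unsigned long long descsz = 64;		/* typical struct page size */

	unsigned long long npages   = ram / pagesz;
	unsigned long long overhead = npages * descsz;

	/* 268435456 pages -> 16 GiB of struct page, ~1.56% of RAM */
	printf("%llu pages, %llu MiB of page descriptors (%.2f%% of RAM)\n",
	       npages, overhead >> 20, 100.0 * overhead / ram);
	return 0;
}

That ~1.6% is paid on every machine whether or not the memory ever
needs 4k granularity, which is the number behind the "gigs of expensive
memory" point above.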
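
And on the N:1 wording quoted at the top: stripped of the compound-page
machinery, the relationship that folio_order() and folio_nr_pages()
encode boils down to the sketch below. It is illustrative only, not the
actual include/linux/mm.h definitions:

#include <stdio.h>

struct folio {
	unsigned int order;	/* folio covers 2^order base pages */
};

static inline unsigned int folio_order(struct folio *folio)
{
	return folio->order;
}

static inline unsigned long folio_nr_pages(struct folio *folio)
{
	return 1UL << folio_order(folio);	/* N base pages : 1 folio */
}

static inline unsigned long folio_size(struct folio *folio)
{
	/* always >= PAGE_SIZE; 4096 stands in for PAGE_SIZE here */
	return 4096UL << folio_order(folio);
}

int main(void)
{
	struct folio f = { .order = 9 };	/* PMD-sized on x86-64 */

	printf("order %u -> %lu pages, %lu KiB\n",
	       folio_order(&f), folio_nr_pages(&f), folio_size(&f) >> 10);
	return 0;
}

An order-0 folio is exactly one page; everything larger is a
power-of-two multiple of pages, which is the N:1 relationship under
discussion.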