Date: Thu, 23 Sep 2021 12:40:14 +0100
From: Matthew Wilcox
To: Kent Overstreet
Cc:
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Johannes Weiner, Linus Torvalds, Andrew Morton,
    "Darrick J. Wong", Christoph Hellwig, David Howells,
    "Kirill A. Shutemov", Mike Kravetz
Subject: Mapcount of subpages
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 23, 2021 at 01:15:16AM -0400, Kent Overstreet wrote:
> On Thu, Sep 23, 2021 at 04:23:12AM +0100, Matthew Wilcox wrote:
> > (compiling that list reminds me that we'll need to sort out mapcount
> > on subpages when it comes time to do this. ask me if you don't know
> > what i'm talking about here.)
>
> I am curious why we would ever need a mapcount for just part of a page, tell me
> more.

I would say Kirill is the expert here. My understanding:

We have three different approaches to allocating 2MB pages today;
anon THP, shmem THP and hugetlbfs. Hugetlbfs can only be mapped on a
2MB boundary, so it has no special handling of mapcount [1]. Anon THP
always starts out as being mapped exclusively on a 2MB boundary, but
then it can be split by, eg, munmap(). If it is, then the mapcount in
the head page is distributed to the subpages.

Shmem THP is the tricky one. You might have a 2MB page in the page
cache, but then have processes which only ever map part of it. Or you
might have some processes mapping it with a 2MB entry and others
mapping part or all of it with 4kB entries. And then someone truncates
the file to midway through this page; we split it, and now we need to
figure out what the mapcount should be on each of the subpages.

We handle this by using ->mapcount on each subpage to record how many
non-2MB mappings there are of that specific page and using
->compound_mapcount to record how many 2MB mappings there are of the
entire 2MB page.
Then, when we split, we just need to distribute the
compound_mapcount to each page to make it correct. We also have the
PageDoubleMap flag to tell us whether anybody has this 2MB page mapped
with 4kB entries, so we can skip all the summing of 4kB mapcounts if
nobody has done that.

[1] Mike is looking to change this, but I'm not sure where he is with it.
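
To make the accounting described above concrete, here is a minimal
userspace sketch in C. It is a toy model under stated assumptions, not
kernel code: struct toy_thp and the toy_*() helpers are hypothetical
names, the double_map field simply follows the description of
PageDoubleMap given above (set once any 4kB mapping exists), and the
layout is not meant to mirror the kernel's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_SUBPAGES 512 /* one 2MB page made of 4kB subpages */

/* Toy model of one 2MB page; every name here is hypothetical. */
struct toy_thp {
	int compound_mapcount;         /* 2MB mappings of the whole page */
	int mapcount[TOY_NR_SUBPAGES]; /* 4kB mappings of each subpage */
	bool double_map;               /* does any 4kB mapping exist? */
};

/* A process maps the whole page with a single 2MB entry. */
void toy_map_2m(struct toy_thp *p)
{
	p->compound_mapcount++;
}

/* A process maps subpage i with a 4kB entry. */
void toy_map_4k(struct toy_thp *p, int i)
{
	assert(i >= 0 && i < TOY_NR_SUBPAGES);
	p->mapcount[i]++;
	p->double_map = true;
}

/* Count all mappings; skip the per-subpage sum when the flag is clear. */
int toy_total_mapcount(const struct toy_thp *p)
{
	int total = p->compound_mapcount;

	if (!p->double_map)
		return total;
	for (int i = 0; i < TOY_NR_SUBPAGES; i++)
		total += p->mapcount[i];
	return total;
}

/* Split: every 2MB mapping becomes a 4kB mapping of each subpage. */
void toy_split(struct toy_thp *p)
{
	if (p->compound_mapcount > 0)
		p->double_map = true; /* all mappings are 4kB from now on */
	for (int i = 0; i < TOY_NR_SUBPAGES; i++)
		p->mapcount[i] += p->compound_mapcount;
	p->compound_mapcount = 0;
}
```

In this model, a page with one 2MB mapping and two 4kB mappings of
subpage 3 ends up, after toy_split(), with a mapcount of 3 on subpage 3
and a mapcount of 1 on every other subpage.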