Date: Sat, 5 Nov 2022 22:51:15 +0300
From: "Kirill A. Shutemov"
To: Hugh Dickins
Cc: Andrew Morton, Matthew Wilcox, David Hildenbrand, Vlastimil Babka,
	Peter Xu, Yang Shi, John Hubbard, Mike Kravetz, Sidhartha Kumar,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Mina Almasry,
	James Houghton, Zach O'Keefe, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH 2/3] mm,thp,rmap: simplify compound page mapcount handling
Message-ID: <20221105195115.2d5yvvepdjsqjmmv@box>
References: <5f52de70-975-e94f-f141-543765736181@google.com>
 <47ad693-717-79c8-e1ba-46c3a6602e48@google.com>
In-Reply-To: <47ad693-717-79c8-e1ba-46c3a6602e48@google.com>

On Wed, Nov 02, 2022 at 06:51:38PM -0700, Hugh Dickins wrote:
> Compound page (folio) mapcount calculations have been different for
> anon and file (or shmem) THPs, and involved the obscure PageDoubleMap
> flag. And each huge mapping and unmapping of a file (or shmem) THP
> involved atomically incrementing and decrementing the mapcount of every
> subpage of that huge page, dirtying many struct page cachelines.
>
> Add subpages_mapcount field to the struct folio and first tail page,
> so that the total of subpage mapcounts is available in one place near
> the head: then page_mapcount() and total_mapcount() and page_mapped(),
> and their folio equivalents, are so quick that anon and file and hugetlb
> don't need to be optimized differently. Delete the unloved PageDoubleMap.
>
> page_add and page_remove rmap functions must now maintain the
> subpages_mapcount as well as the subpage _mapcount, when dealing with
> pte mappings of huge pages; and correct maintenance of NR_ANON_MAPPED
> and NR_FILE_MAPPED statistics still needs reading through the subpages,
> using nr_subpages_unmapped() - but only when first or last pmd mapping
> finds subpages_mapcount raised (double-map case, not the common case).
>
> But are those counts (used to decide when to split an anon THP, and
> in vmscan's pagecache_reclaimable heuristic) correctly maintained?
> Not quite: since page_remove_rmap() (and also split_huge_pmd()) is
> often called without page lock, there can be races when a subpage pte
> mapcount 0<->1 while compound pmd mapcount 0<->1 is scanning - races
> which the previous implementation had prevented. The statistics might
> become inaccurate, and even drift down until they underflow through 0.
> That is not good enough, but is better dealt with in a followup patch.
>
> Update a few comments on first and second tail page overlaid fields.
> hugepage_add_new_anon_rmap() has to "increment" compound_mapcount, but
> subpages_mapcount and compound_pincount are already correctly at 0,
> so delete its reinitialization of compound_pincount.
>
> A simple 100 X munmap(mmap(2GB, MAP_SHARED|MAP_POPULATE, tmpfs), 2GB)
> took 18 seconds on small pages, and used to take 1 second on huge pages,
> but now takes 119 milliseconds on huge pages. Mapping by pmds a second
> time used to take 860ms and now takes 92ms; mapping by pmds after mapping
> by ptes (when the scan is needed) used to take 870ms and now takes 495ms.
> But there might be some benchmarks which would show a slowdown, because
> tail struct pages now fall out of cache until final freeing checks them.
>
> Signed-off-by: Hugh Dickins

Thanks for doing this!

Acked-by: Kirill A. Shutemov

And sorry again for PageDoubleMap() :/

Minor nitpick and a question below.

> @@ -829,12 +829,20 @@ static inline int folio_entire_mapcount(struct folio *folio)
>  
>  /*
>   * Mapcount of compound page as a whole, does not include mapped sub-pages.
> - *
> - * Must be called only for compound pages.
> + * Must be called only on head of compound page.
>   */
> -static inline int compound_mapcount(struct page *page)
> +static inline int head_compound_mapcount(struct page *head)
>  {
> -	return folio_entire_mapcount(page_folio(page));
> +	return atomic_read(compound_mapcount_ptr(head)) + 1;
> +}
> +
> +/*
> + * Sum of mapcounts of sub-pages, does not include compound mapcount.
> + * Must be called only on head of compound page.
> + */
> +static inline int head_subpages_mapcount(struct page *head)
> +{
> +	return atomic_read(subpages_mapcount_ptr(head));
>  }
>  
>  /*

Any particular reason these two do not take struct folio as an input?
It would guarantee that it is a non-tail page. It will not guarantee a
large folio, but it is something.
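
Totally untested sketch of what I mean, on top of this patch (the names
are made up by me):

	/*
	 * Hypothetical folio-taking variants: struct folio cannot refer
	 * to a tail page, so the "head only" rule becomes a type rule.
	 */
	static inline int folio_compound_mapcount(struct folio *folio)
	{
		/* counter starts at -1, hence the +1 */
		return atomic_read(compound_mapcount_ptr(&folio->page)) + 1;
	}

	static inline int folio_subpages_mapcount(struct folio *folio)
	{
		return atomic_read(subpages_mapcount_ptr(&folio->page));
	}

Callers that only have a struct page could go through page_folio()
first.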

> @@ -1265,8 +1288,6 @@ void page_add_new_anon_rmap(struct page *page,
>  		VM_BUG_ON_PAGE(!PageTransHuge(page), page);
>  		/* increment count (starts at -1) */
>  		atomic_set(compound_mapcount_ptr(page), 0);
> -		atomic_set(compound_pincount_ptr(page), 0);
> -

It has to be initialized to 0 on allocation, right?
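
If I read mm/page_alloc.c correctly, prep_compound_head() takes care of
it at allocation time. From memory, so a sketch rather than a verbatim
quote:

	static void prep_compound_head(struct page *page, unsigned int order)
	{
		set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
		set_compound_order(page, order);
		/* mapcount convention: -1 means "not mapped"; pincount starts at 0 */
		atomic_set(compound_mapcount_ptr(page), -1);
		atomic_set(compound_pincount_ptr(page), 0);
	}

If that is right, the atomic_set() removed above was indeed redundant.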

>  		__mod_lruvec_page_state(page, NR_ANON_THPS, nr);
>  	} else {
>  		/* increment count (starts at -1) */

-- 
 Kiryl Shutsemau / Kirill A. Shutemov