From: "Huang, Ying"
To: Ryan Roberts
Cc: Andrew Morton, Matthew Wilcox, "Kirill A. Shutemov", Yin Fengwei,
 David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon,
 Anshuman Khandual, Yang Shi
Subject: Re: [PATCH v2 2/5] mm: Allow deferred splitting of arbitrary large anon folios
References: <20230703135330.1865927-1-ryan.roberts@arm.com>
 <20230703135330.1865927-3-ryan.roberts@arm.com>
 <877crcgmj1.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <6379dd13-551e-3c73-422a-56ce40b27deb@arm.com>
 <87ttucfht7.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Mon, 10 Jul 2023 17:01:39 +0800
In-Reply-To: (Ryan Roberts's message of "Mon, 10 Jul 2023 09:29:57 +0100")
Message-ID: <878rbof8cs.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
X-Mailing-List: linux-kernel@vger.kernel.org

Ryan Roberts writes:

> On 10/07/2023 06:37, Huang, Ying wrote:
>> Ryan Roberts writes:
>>
>>> Somehow I managed to reply only to the linux-arm-kernel list on first
>>> attempt so resending:
>>>
>>> On 07/07/2023 09:21, Huang, Ying wrote:
>>>> Ryan Roberts writes:
>>>>
>>>>> With the introduction of large folios for anonymous
>>>>> memory, we would like to be able to split them when they have
>>>>> unmapped subpages, in order to free those unused pages under memory
>>>>> pressure. So remove the artificial requirement that the large folio
>>>>> needed to be at least PMD-sized.
>>>>>
>>>>> Signed-off-by: Ryan Roberts
>>>>> Reviewed-by: Yu Zhao
>>>>> Reviewed-by: Yin Fengwei
>>>>> ---
>>>>>  mm/rmap.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>>>> index 82ef5ba363d1..bbcb2308a1c5 100644
>>>>> --- a/mm/rmap.c
>>>>> +++ b/mm/rmap.c
>>>>> @@ -1474,7 +1474,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
>>>>>  	 * page of the folio is unmapped and at least one page
>>>>>  	 * is still mapped.
>>>>>  	 */
>>>>> -	if (folio_test_pmd_mappable(folio) && folio_test_anon(folio))
>>>>> +	if (folio_test_large(folio) && folio_test_anon(folio))
>>>>>  		if (!compound || nr < nr_pmdmapped)
>>>>>  			deferred_split_folio(folio);
>>>>>  }
>>>>
>>>> One possible issue is that even for large folios mapped only in one
>>>> process, in zap_pte_range(), we will always call deferred_split_folio()
>>>> unnecessarily before freeing a large folio.
>>>
>>> Hi Huang, thanks for reviewing!
>>>
>>> I have a patch that solves this problem by determining a range of ptes
>>> covered by a single folio and doing a "batch zap". This prevents the
>>> need to add the folio to the deferred split queue, only to remove it
>>> again shortly afterwards. This reduces lock contention and I can
>>> measure a performance improvement for the kernel compilation benchmark.
>>> See [1].
>>>
>>> However, I decided to remove it from this patch set on Yu Zhao's
>>> advice. We are aiming for the minimal patch set to start with and
>>> wanted to focus people on that. I intend to submit it separately later
>>> on.
>>>
>>> [1] https://lore.kernel.org/linux-mm/20230626171430.3167004-8-ryan.roberts@arm.com/
>>
>> Thanks for your information! "batch zap" can solve the problem.
>>
>> And, I agree with Matthew's comments to fix the large folios
>> interaction issues before merging the patches to allocate large folios
>> as in the following email.
>>
>> https://lore.kernel.org/linux-mm/ZKVdUDuwNWDUCWc5@casper.infradead.org/
>>
>> If so, we don't need to introduce the above problem or a large patchset.
>
> I appreciate Matthew's and others' position about not wanting to merge
> a minimal implementation while there are some fundamental features
> (e.g. compaction) it doesn't play well with - I'm working to create a
> definitive list so these items can be tracked and tackled.

Good to know this, thanks!

> That said, I don't see this "batch zap" patch as an example of this.
> It's just a performance enhancement that improves things even further
> than large anon folios on their own. I'd rather concentrate on the core
> changes first, then deal with this type of thing later. Does that work
> for you?

IIUC, allocating large folios upon page fault depends on splitting large
folios in page_remove_rmap() to avoid memory wastage. Splitting large
folios in page_remove_rmap() depends on "batch zap" to avoid performance
regression in zap_pte_range(). So we need them to be done earlier. Or
did I miss something?

Best Regards,
Huang, Ying
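[Editorial aside: the one-line change under discussion widens the condition for queuing a partially-mapped anonymous folio for deferred splitting. A hypothetical simplification of the before/after predicate -- the struct and helper names here are illustrative, not the real page_remove_rmap() code -- looks like this.]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio state page_remove_rmap() sees. */
struct folio_model {
	bool anon;           /* anonymous memory? */
	unsigned nr_pages;   /* 1 for a small folio, >1 for a large one */
};

/*
 * Before the patch: only folios at least PMD-sized (512 pages on a
 * typical 4K-page x86-64 config) qualified for a deferred split.
 */
static bool should_defer_split_old(const struct folio_model *f,
				   unsigned pmd_nr_pages)
{
	return f->nr_pages >= pmd_nr_pages && f->anon;
}

/* After the patch: any large (multi-page) anon folio qualifies. */
static bool should_defer_split_new(const struct folio_model *f)
{
	return f->nr_pages > 1 && f->anon;
}
```

The point of the patch is visible in the difference: a 16-page anon folio with unmapped subpages was previously never split back, wasting memory, while after the change it becomes a deferred-split candidate just like a PMD-sized one.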