Subject: Re: [PATCH v4 6/8] hugetlb: batch PMD split for bulk vmemmap dedup
From: Muchun Song
Date: Tue, 19 Sep 2023 16:41:44 +0800
To: Joao Martins
Cc: Mike Kravetz, Muchun Song, Oscar Salvador, David Hildenbrand,
    Miaohe Lin, David Rientjes, Anshuman Khandual, Naoya Horiguchi,
    Barry Song <21cnbao@gmail.com>, Michal Hocko, Matthew Wilcox,
    Xiongchun Duan, Linux-MM, Andrew Morton, LKML
Message-Id: <07192BE2-C66E-4F74-8F76-05F57777C6B7@linux.dev>
References: <20230918230202.254631-1-mike.kravetz@oracle.com>
    <20230918230202.254631-7-mike.kravetz@oracle.com>
    <9c627733-e6a2-833b-b0f9-d59552f6ab0d@linux.dev>

> On Sep 19, 2023, at 16:26, Joao Martins wrote:
>
> On 19/09/2023 07:42, Muchun Song wrote:
>> On 2023/9/19 07:01, Mike Kravetz wrote:
>>> From: Joao Martins
>>>
>>> In an effort to minimize amount of TLB flushes, batch all PMD splits
>>> belonging to a range of pages in order to perform only 1 (global) TLB
>>> flush.
>>>
>>> Add a flags field to the walker and pass whether it's a bulk allocation
>>> or just a single page to decide to remap.
>>> First value (VMEMMAP_SPLIT_NO_TLB_FLUSH) designates the request to not
>>> do the TLB flush when we split the PMD.
>>>
>>> Rebased and updated by Mike Kravetz
>>>
>>> Signed-off-by: Joao Martins
>>> Signed-off-by: Mike Kravetz
>>> ---
>>>  mm/hugetlb_vmemmap.c | 79 +++++++++++++++++++++++++++++++++++++++++---
>>>  1 file changed, 75 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index 147ed15bcae4..e8bc2f7567db 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -27,6 +27,7 @@
>>>   * @reuse_addr:     the virtual address of the @reuse_page page.
>>>   * @vmemmap_pages:  the list head of the vmemmap pages that can be freed
>>>   *                  or is mapped from.
>>> + * @flags:          used to modify behavior in bulk operations
>>
>> Better to describe it as "used to modify behavior in vmemmap page table walking
>> operations"
>>
> OK
>
>>>  void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head
>>> *folio_list)
>>>  {
>>>      struct folio *folio;
>>>      LIST_HEAD(vmemmap_pages);
>>> +    list_for_each_entry(folio, folio_list, lru)
>>> +        hugetlb_vmemmap_split(h, &folio->page);
>>> +
>>> +    flush_tlb_all();
>>> +
>>>      list_for_each_entry(folio, folio_list, lru) {
>>>          int ret = __hugetlb_vmemmap_optimize(h, &folio->page,
>>>                                  &vmemmap_pages);
>>
>> This is unlikely to be failed since the page table allocation
>> is moved to the above
>
>> (Note that the head vmemmap page allocation
>> is not mandatory).
>
> Good point that I almost forgot
>
>> So we should handle the error case in the above
>> splitting operation.
>
> But back to the previous discussion in v2... the thinking was that /some/ PMDs
> got split, and say could allow some PTE remapping to occur and free some pages
> back (each page allows 6 more splits worst case).
> Then the next __hugetlb_vmemmap_optimize() will have to split PMD pages
> again for those hugepages that failed the batch PMD split (as we only
> defer the PTE remap tlb flush in this stage).

Oh, yes. Maybe we could break out of the above traversal as early as
possible once we hit an ENOMEM?

>
> Unless this isn't something worth handling
>
> Joao
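
To make the early-break idea concrete, here is a rough userspace model of
the two passes, not kernel code. Every name in it (fake_folio, fake_split,
fake_optimize, optimize_folios, pt_pages_left, freed_per_optimize,
tlb_flushes) is made up for illustration, and the allocator is reduced to
a simple counter. It only assumes what is discussed above: the first pass
splits without flushing and stops at the first ENOMEM, one global flush
follows, and the second pass retries the split (with its own flush) for
folios the batch pass did not reach.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a HugeTLB folio; not the kernel's struct folio. */
struct fake_folio {
	bool pmd_split;	/* simulated PMD split already done */
	bool optimized;	/* simulated vmemmap remap done */
};

int pt_pages_left;	/* simulated budget of page-table pages */
int freed_per_optimize;	/* pages handed back per remap (assumed knob) */
int tlb_flushes;	/* number of simulated TLB flushes */

/* Split one folio's PMD; flush right away unless the caller batches. */
int fake_split(struct fake_folio *f, bool no_tlb_flush)
{
	if (f->pmd_split)
		return 0;
	if (pt_pages_left == 0)
		return -ENOMEM;
	pt_pages_left--;
	f->pmd_split = true;
	if (!no_tlb_flush)
		tlb_flushes++;
	return 0;
}

/* Remap one folio, retrying the split if the batch pass skipped it. */
int fake_optimize(struct fake_folio *f)
{
	int ret = fake_split(f, false);

	if (ret)
		return ret;
	f->optimized = true;
	pt_pages_left += freed_per_optimize;
	return 0;
}

/* Returns how many folios ended up optimized. */
size_t optimize_folios(struct fake_folio *folios, size_t n)
{
	size_t i, done = 0;

	assert(folios != NULL || n == 0);

	/* Pass 1: batched splits, stop at the first ENOMEM. */
	for (i = 0; i < n; i++) {
		if (fake_split(&folios[i], true) == -ENOMEM)
			break;
	}

	tlb_flushes++;	/* the single global flush for the batch */

	/* Pass 2: remap; remapped folios may free pages for later splits. */
	for (i = 0; i < n; i++) {
		if (fake_optimize(&folios[i]) == 0)
			done++;
	}
	return done;
}
```

In this model, breaking early costs nothing when the remap pass frees
pages, since the remaining folios are split on demand in the second pass,
each paying its own flush. When nothing is freed, the folios past the
failure point are simply left unoptimized.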