Date: Mon, 9 Aug 2021 10:44:30 +0800
From: Feng Tang
To: Michal Hocko
Cc: linux-mm@kvack.org, Andrew Morton, David Rientjes, Dave Hansen,
	Ben Widawsky, linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	Andrea Arcangeli, Mel Gorman, Mike Kravetz, Randy Dunlap,
	Vlastimil Babka, Andi Kleen, Dan Williams, ying.huang@intel.com
Subject: Re: [PATCH v7 3/5] mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
Message-ID: <20210809024430.GA46432@shbuild999.sh.intel.com>
References: <1627970362-61305-1-git-send-email-feng.tang@intel.com>
 <1627970362-61305-4-git-send-email-feng.tang@intel.com>
Hi Michal,

Thanks for the review and the ACKs for the 1/5 and 2/5 patches.

On Fri, Aug 06, 2021 at 03:35:48PM +0200, Michal Hocko wrote:
> On Tue 03-08-21 13:59:20, Feng Tang wrote:
> > From: Ben Widawsky
> >
> > Implement the missing huge page allocation functionality while obeying
> > the preferred node semantics. This is similar to the implementation
> > for general page allocation, as it uses a fallback mechanism to try
> > multiple preferred nodes first, and then all other nodes.
> >
> > [akpm: fix compiling issue when merging with other hugetlb patch]
> > [Thanks to 0day bot for catching the missing #ifdef CONFIG_NUMA issue]
> > Link: https://lore.kernel.org/r/20200630212517.308045-12-ben.widawsky@intel.com
> > Suggested-by: Michal Hocko
> > Signed-off-by: Ben Widawsky
> > Co-developed-by: Feng Tang
> > Signed-off-by: Feng Tang
>
> The ifdefery is just ugly as hell. One way to get rid of that would be to
> provide a mpol_is_preferred_many() wrapper and hide the CONFIG_NUMA check
> in mempolicy.h. I haven't checked, but this might help to remove some
> other ifdefery as well.
>
> I especially dislike the label hidden in the ifdef. You can get rid of
> that by checking the page for NULL.

Yes, the 'ifdef's were annoying to me too; thanks for the suggestions.
Below is the revised patch based on them.

Thanks,
Feng

-------8<---------------------

From fc30718c40f02ba5ea73456af49173e66b5032c1 Mon Sep 17 00:00:00 2001
From: Ben Widawsky
Date: Thu, 5 Aug 2021 23:01:11 -0400
Subject: [PATCH] mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY

Implement the missing huge page allocation functionality while obeying
the preferred node semantics.
This is similar to the implementation for general page allocation, as
it uses a fallback mechanism to try multiple preferred nodes first,
and then all other nodes.

To avoid adding too many "#ifdef CONFIG_NUMA" checks, add a helper
function in mempolicy.h to check whether a mempolicy is
MPOL_PREFERRED_MANY.

[akpm: fix compiling issue when merging with other hugetlb patch]
[Thanks to 0day bot for catching the !CONFIG_NUMA compiling issue]
[Michal Hocko: suggest to remove the #ifdef CONFIG_NUMA check]
Link: https://lore.kernel.org/r/20200630212517.308045-12-ben.widawsky@intel.com
Link: https://lkml.kernel.org/r/1627970362-61305-4-git-send-email-feng.tang@intel.com
Suggested-by: Michal Hocko
Signed-off-by: Ben Widawsky
Co-developed-by: Feng Tang
Signed-off-by: Feng Tang
---
 include/linux/mempolicy.h | 12 ++++++++++++
 mm/hugetlb.c              | 28 ++++++++++++++++++++++++----
 2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 0117e1e..60d5e6c 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -187,6 +187,12 @@ extern void mpol_put_task_policy(struct task_struct *);

 extern bool numa_demotion_enabled;

+static inline bool mpol_is_preferred_many(struct mempolicy *pol)
+{
+	return (pol->mode == MPOL_PREFERRED_MANY);
+}
+
+
 #else

 struct mempolicy {};
@@ -297,5 +303,11 @@ static inline nodemask_t *policy_nodemask_current(gfp_t gfp)
 }

 #define numa_demotion_enabled	false
+
+static inline bool mpol_is_preferred_many(struct mempolicy *pol)
+{
+	return false;
+}
+
 #endif /* CONFIG_NUMA */
 #endif

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 95714fb..75ea8bc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1145,7 +1145,7 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
 				unsigned long address, int avoid_reserve,
 				long chg)
 {
-	struct page *page;
+	struct page *page = NULL;
 	struct mempolicy *mpol;
 	gfp_t gfp_mask;
 	nodemask_t *nodemask;
@@ -1166,7 +1166,17 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,

 	gfp_mask = htlb_alloc_mask(h);
 	nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
-	page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
+
+	if (mpol_is_preferred_many(mpol)) {
+		page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
+
+		/* Fallback to all nodes if page==NULL */
+		nodemask = NULL;
+	}
+
+	if (!page)
+		page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
+
 	if (page && !avoid_reserve && vma_has_reserves(vma, chg)) {
 		SetHPageRestoreReserve(page);
 		h->resv_huge_pages--;
@@ -2147,9 +2157,19 @@ struct page *alloc_buddy_huge_page_with_mpol(struct hstate *h,
 	nodemask_t *nodemask;

 	nid = huge_node(vma, addr, gfp_mask, &mpol, &nodemask);
-	page = alloc_surplus_huge_page(h, gfp_mask, nid, nodemask, false);
-	mpol_cond_put(mpol);
+	if (mpol_is_preferred_many(mpol)) {
+		gfp_t gfp = gfp_mask | __GFP_NOWARN;

+		gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
+		page = alloc_surplus_huge_page(h, gfp, nid, nodemask, false);
+
+		/* Fallback to all nodes if page==NULL */
+		nodemask = NULL;
+	}
+
+	if (!page)
+		page = alloc_surplus_huge_page(h, gfp_mask, nid, nodemask, false);
+	mpol_cond_put(mpol);
 	return page;
 }
--
2.7.4