Date: Wed, 4 Sep 2019 12:54:25 -0700 (PDT)
From: David Rientjes
To: Linus Torvalds, Andrew Morton
Cc: Andrea Arcangeli, Michal Hocko, Mel Gorman, Vlastimil Babka,
    "Kirill A. Shutemov", linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [rfc 4/4] mm, page_alloc: allow hugepage fallback to remote nodes when madvised

For systems configured to always try hard to allocate transparent
hugepages (thp defrag setting of "always") or for memory that has been
explicitly madvised to MADV_HUGEPAGE, it is often better to fall back to
remote memory to allocate the hugepage if the local allocation fails
first.

The point is to allow the initial call to __alloc_pages_node() to attempt
to defragment local memory to make a hugepage available, if possible,
rather than immediately falling back to remote memory.
Local hugepages will always have a better access latency than remote
(huge)pages, so an attempt to make a hugepage available locally is always
preferred.

If memory compaction cannot succeed locally, however, it is likely better
to fall back to remote memory.  This could take two forms: either allow
immediate fallback to remote memory or do per-zone watermark checks.  It
would be possible to fall back only when per-zone watermarks fail for
order-0 memory, since that would require local reclaim for all subsequent
faults, so a remote huge allocation is likely better than thrashing the
local zone for large workloads.

In this case, it is assumed that, because the system is configured to try
hard to allocate hugepages or the vma has been explicitly advised to try
hard for hugepages, remote allocation is better when local allocation and
memory compaction have both failed.

Signed-off-by: David Rientjes
---
 mm/mempolicy.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2133,6 +2133,17 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
 			mpol_cond_put(pol);
 			page = __alloc_pages_node(hpage_node,
 						gfp | __GFP_THISNODE, order);
+
+			/*
+			 * If hugepage allocations are configured to always
+			 * synchronous compact or the vma has been madvised
+			 * to prefer hugepage backing, retry allowing remote
+			 * memory as well.
+			 */
+			if (!page && (gfp & __GFP_DIRECT_RECLAIM))
+				page = __alloc_pages_node(hpage_node,
+						gfp | __GFP_NORETRY, order);
+
 			goto out;
 		}
 	}
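
To illustrate the control flow this patch adds, here is a minimal userspace
sketch of the two-step allocation. It is a toy model and not kernel code:
every name prefixed with model_ or MODEL_ is hypothetical, the flag bits are
invented for the example, and "compaction" is reduced to a per-node boolean.
It only aims to show that the local node is tried first with the THISNODE
restriction, and that remote nodes are considered only when that attempt
fails and the caller allowed direct reclaim.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this model only; not the kernel's values. */
#define MODEL_GFP_THISNODE       (1u << 0)
#define MODEL_GFP_DIRECT_RECLAIM (1u << 1)
#define MODEL_GFP_NORETRY        (1u << 2)

#define MODEL_NR_NODES 2
#define MODEL_NO_PAGE  (-1)

/* Toy per-node state: is a hugepage free, or obtainable by compaction? */
struct model_node {
	bool free_hugepage;
	bool compactable;
};

/* Try one node; compaction is only attempted if direct reclaim is allowed. */
static int model_try_node(const struct model_node *nodes, int id, unsigned gfp)
{
	if (nodes[id].free_hugepage)
		return id;
	if ((gfp & MODEL_GFP_DIRECT_RECLAIM) && nodes[id].compactable)
		return id;
	return MODEL_NO_PAGE;
}

/*
 * Stand-in for an allocator call with a preferred node: with
 * MODEL_GFP_THISNODE only the preferred node is tried, otherwise the
 * remaining nodes are tried in order. MODEL_GFP_NORETRY is accepted but
 * has no effect here because the model never loops.
 */
static int model_alloc_pages_node(const struct model_node *nodes,
				  int preferred, unsigned gfp)
{
	int got = model_try_node(nodes, preferred, gfp);

	if (got != MODEL_NO_PAGE || (gfp & MODEL_GFP_THISNODE))
		return got;
	for (int id = 0; id < MODEL_NR_NODES; id++) {
		if (id == preferred)
			continue;
		got = model_try_node(nodes, id, gfp);
		if (got != MODEL_NO_PAGE)
			return got;
	}
	return MODEL_NO_PAGE;
}

/* Returns the node that supplied the hugepage, or MODEL_NO_PAGE. */
static int model_alloc_hugepage(const struct model_node *nodes,
				int hpage_node, unsigned gfp)
{
	assert(hpage_node >= 0 && hpage_node < MODEL_NR_NODES);

	/* Step 1: local node only, as in the existing code path. */
	int page = model_alloc_pages_node(nodes, hpage_node,
					  gfp | MODEL_GFP_THISNODE);

	/* Step 2 (what the patch adds): retry allowing remote nodes. */
	if (page == MODEL_NO_PAGE && (gfp & MODEL_GFP_DIRECT_RECLAIM))
		page = model_alloc_pages_node(nodes, hpage_node,
					      gfp | MODEL_GFP_NORETRY);
	return page;
}
```

In this model, a caller passing MODEL_GFP_DIRECT_RECLAIM with node 0
exhausted and node 1 holding a free hugepage gets node 1 back, while the same
call without MODEL_GFP_DIRECT_RECLAIM gets MODEL_NO_PAGE, mirroring the
(gfp & __GFP_DIRECT_RECLAIM) gate in the hunk above.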