From: "Aneesh Kumar K.V"
To: Vlastimil Babka, Andrew Morton, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Vlastimil Babka, David Rientjes, "Kirill A. Shutemov", Andrea Arcangeli, Michal Hocko
Subject: Re: [PATCH] mm, thp: respect MPOL_PREFERRED policy with non-local node
In-Reply-To: <1434639273-9527-1-git-send-email-vbabka@suse.cz>
Date: Fri, 19 Jun 2015 14:30:43 +0530
Message-ID: <871th89io4.fsf@linux.vnet.ibm.com>

Vlastimil Babka writes:

> Since commit 077fcf116c8c ("mm/thp: allocate transparent hugepages on local
> node"), we handle THP allocations on page fault in a special way - for
> non-interleave memory policies, the allocation is only attempted on the node
> local to the current CPU, if the policy's nodemask allows the node.
>
> This is motivated by the assumption that THP benefits cannot offset the cost
> of remote accesses, so it's better to fallback to base pages on the local node
> (which might still be available, while huge pages are not due to
> fragmentation) than to allocate huge pages on a remote node.
>
> The nodemask check prevents us from violating e.g. MPOL_BIND policies where
> the local node is not among the allowed nodes. However, the current
> implementation can still give surprising results for the MPOL_PREFERRED policy
> when the preferred node is different than the current CPU's local node.
>
> In such case we should honor the preferred node and not use the local node,
> which is what this patch does. If hugepage allocation on the preferred node
> fails, we fall back to base pages and don't try other nodes, with the same
> motivation as is done for the local node hugepage allocations.
> The patch also moves the MPOL_INTERLEAVE check around to simplify the hugepage
> specific test.
>
> The difference can be demonstrated using in-tree transhuge-stress test on the
> following 2-node machine where half memory on one node was occupied to show
> the difference.
>
> .....
> Without -p parameter, hugepage restriction to CPU-local node works as before.
>
> Fixes: 077fcf116c8c ("mm/thp: allocate transparent hugepages on local node")
> Signed-off-by: Vlastimil Babka
> Cc: Aneesh Kumar K.V
> Cc: David Rientjes
> Cc: Kirill A. Shutemov
> Cc: Andrea Arcangeli
> Cc: Michal Hocko

Reviewed-by: Aneesh Kumar K.V

-aneesh
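
For readers following the thread, the node selection described in the quoted changelog can be modelled roughly as below. This is a user-space sketch only, not the kernel code: the names `policy_mode`, `struct policy`, `thp_fault_node` and `NO_SINGLE_NODE` are hypothetical, the nodemask is simplified to a single `unsigned long`, and a zero mask is assumed to mean "no restriction". What the allocator does when the single-node handling does not apply is outside the sketch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's policy modes; not the real enum. */
enum policy_mode { POL_DEFAULT, POL_PREFERRED, POL_BIND, POL_INTERLEAVE };

/* Simplified policy: nodemask bit n set means node n is allowed.
 * Assumption of this sketch: nodemask == 0 means "no restriction". */
struct policy {
    enum policy_mode mode;
    int preferred_node;     /* only meaningful for POL_PREFERRED; -1 = local */
    unsigned long nodemask;
};

#define MAX_NODES ((int)(8 * sizeof(unsigned long)))
#define NO_SINGLE_NODE (-1) /* the single-node THP handling does not apply */

/*
 * Pick the one node on which a huge page would be attempted at fault time,
 * following the behaviour described in the changelog:
 *  - interleave policies are excluded from the special handling;
 *  - MPOL_PREFERRED with an explicit node uses that node, not the local one;
 *  - otherwise the CPU-local node is used;
 *  - the chosen node must be allowed by the nodemask.
 */
static int thp_fault_node(const struct policy *pol, int local_node)
{
    int node = local_node;

    assert(local_node >= 0 && local_node < MAX_NODES);

    if (pol->mode == POL_INTERLEAVE)
        return NO_SINGLE_NODE;

    if (pol->mode == POL_PREFERRED && pol->preferred_node >= 0) {
        assert(pol->preferred_node < MAX_NODES);
        node = pol->preferred_node;
    }

    if (pol->nodemask != 0 && !(pol->nodemask & (1UL << node)))
        return NO_SINGLE_NODE;

    return node;
}
```

With a preferred node of 1 and a CPU running on node 0, the model returns 1 rather than 0, which is the case the patch addresses. A bind-style mask that excludes the local node yields `NO_SINGLE_NODE`, matching the nodemask check mentioned in the changelog.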