Date: Thu, 3 Dec 2020 09:37:39 +0000
From: Mel Gorman
To: "Huang, Ying"
Cc: Peter Zijlstra, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    "Matthew Wilcox (Oracle)", Rafael Aquini, Andrew Morton, Ingo Molnar,
    Rik van Riel, Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
    David Rientjes, linux-api@vger.kernel.org
Subject: Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING
Message-ID: <20201203093739.GB3306@suse.de>
References: <20201202084234.15797-1-ying.huang@intel.com>
    <20201202084234.15797-3-ying.huang@intel.com>
    <20201202114357.GW3306@suse.de>
    <87ft4npskx.fsf@yhuang-dev.intel.com>
In-Reply-To: <87ft4npskx.fsf@yhuang-dev.intel.com>

On Thu, Dec 03, 2020 at 09:49:02AM +0800, Huang, Ying wrote:
> >> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
> >> index 68011eecb..3754b3e12 100644
> >> --- a/man2/set_mempolicy.2
> >> +++ b/man2/set_mempolicy.2
> >> @@ -113,6 +113,12 @@ A nonempty
> >>  .I nodemask
> >>  specifies node IDs that are relative to the set of
> >>  node IDs allowed by the process's current cpuset.
> >> +.TP
> >> +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"
> >> +Enable the Linux kernel NUMA balancing for the task if it is supported
> >> +by kernel.
> >> +If the flag isn't supported by Linux kernel, return -1 and errno is
> >> +set to EINVAL.
> >>  .PP
> >>  .I nodemask
> >>  points to a bit mask of node IDs that contains up to
> >> @@ -293,6 +299,9 @@ argument specified both
> >
> > Should this be expanded more to clarify it applies to MPOL_BIND
> > specifically?
> >
> > Maybe the first patch should be expanded more and explicitly fail if
> > MPOL_F_NUMA_BALANCING is used with anything other than MPOL_BIND?
>
> For MPOL_PREFERRED, why could we not use NUMA balancing to migrate pages
> to the accessing local node if it is same as the preferred node?

You could, but the kernel patch does not do that, which would mean
making preferred_nid stick to the preferred node when hinting faults are
trapped on that VMA. It would have to be a separate patch coupled with a
man page update. If you wanted to go in this direction in the future,
then the patch should explicitly return an error *now* if MPOL_PREFERRED
is or'd with MPOL_F_NUMA_BALANCING so that once an application becomes
aware of MPOL_F_NUMA_BALANCING, it can detect whether support exists in
the currently running kernel.

> Even for MPOL_INTERLEAVE, if the target node is the same as the
> accessing local node, can we use NUMA balancing to migrate pages?
>

The intent of MPOL_INTERLEAVE is to average the cost of memory accesses
so that it is roughly similar across the entire range of the VMA. This
may be particularly important if the VMA is shared between multiple
threads that are spread out on multiple nodes.
A change in semantics there should be clearly documented. Similarly, if
you want to go in this direction, MPOL_F_NUMA_BALANCING should be
checked against MPOL_INTERLEAVE and explicitly fail now so that support
can be detected at runtime.

> So, I prefer to make MPOL_F_NUMA_BALANCING to be
>
>   Optimizing with NUMA balancing if possible, and we may add more
>   optimization in the future.
>

Maybe, but I think it's best that the actual behaviour of the kernel is
documented instead of desired behaviour or future planning.

-- 
Mel Gorman
SUSE Labs
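
The runtime detection discussed above could be sketched in C roughly as
follows. This is an illustrative sketch only, not code from the patch
series. The names classify_probe(), probe_numa_balancing() and the SK_
constants are hypothetical; the constants are assumed to mirror the
values in the kernel's uapi <linux/mempolicy.h>. The sketch also assumes
that a kernel without support for the requested mode/flag combination
fails the call with EINVAL, as the quoted man page text describes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Assumed to match the kernel uapi values; defined locally so the
 * sketch builds without libnuma's <numaif.h>. */
#define SK_MPOL_DEFAULT          0
#define SK_MPOL_BIND             2
#define SK_MPOL_F_NUMA_BALANCING (1 << 13)

enum probe_result { PROBE_SUPPORTED, PROBE_UNSUPPORTED, PROBE_UNKNOWN };

/* Interpret the return value and errno of a set_mempolicy() probe. */
enum probe_result classify_probe(long ret, int err)
{
    if (ret == 0)
        return PROBE_SUPPORTED;
    if (err == EINVAL)
        return PROBE_UNSUPPORTED; /* flag or combination rejected */
    return PROBE_UNKNOWN;         /* e.g. EPERM or ENOSYS: no conclusion */
}

/* Try to set `mode` or'd with the NUMA balancing flag for the calling
 * task. `nodemask` holds one bit per node ID (bit 0 = node 0). */
enum probe_result probe_numa_balancing(int mode, unsigned long nodemask)
{
#ifdef SYS_set_mempolicy
    long ret = syscall(SYS_set_mempolicy,
                       (long)(mode | SK_MPOL_F_NUMA_BALANCING),
                       &nodemask, 8 * sizeof(nodemask));
    int err = errno;

    /* The probe changed the task policy; reset it to the default.
     * A real program would save the old policy with get_mempolicy()
     * and restore that instead. */
    if (ret == 0)
        syscall(SYS_set_mempolicy, (long)SK_MPOL_DEFAULT, NULL, 0UL);

    return classify_probe(ret, err);
#else
    (void)mode;
    (void)nodemask;
    return PROBE_UNKNOWN;
#endif
}
```

A caller might use probe_numa_balancing(SK_MPOL_BIND, 1UL) to test the
flag with MPOL_BIND on node 0. EINVAL is also returned for other invalid
arguments, such as an empty nodemask with MPOL_BIND, so an "unsupported"
result is only meaningful when the same call succeeds without the flag.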