Subject: Re: [PATCH 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

The difference between MPOL_DEFAULT and MPOL_LOCAL may be confusing.
Could the MPOL_LOCAL description mention somewhere that MPOL_DEFAULT
reverts to the policy of the process, whereas MPOL_LOCAL tries to
allocate locally? Note also that an MPOL_LOCAL allocation will not be
local if the local node happens to be overallocated. This usually
confuses newcomers. Node/zone reclaim must be enabled to allow node-local
reclaim, which is what results in a node-local allocation.
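
A minimal sketch of the two modes being contrasted here, not taken from
the patch. It sets the calling thread's policy through the raw
set_mempolicy system call so that it does not depend on libnuma's
<numaif.h>. The sketch_ names are hypothetical, and the numeric mode
values are assumed to match the kernel's <linux/mempolicy.h>.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed to mirror the values in the kernel's <linux/mempolicy.h>. */
enum {
    SKETCH_MPOL_DEFAULT = 0,
    SKETCH_MPOL_LOCAL   = 4,
};

/*
 * Set the calling thread's memory policy, always passing an empty
 * node set (nodemask == NULL, maxnode == 0).
 *
 * Returns 0 on success or a negative errno value on failure, for
 * example -ENOSYS on a kernel built without NUMA support or -EPERM
 * when a seccomp filter blocks the call.
 */
static int sketch_set_thread_policy(int mode)
{
    /* Only the modes that accept an empty node set are handled here. */
    assert(mode == SKETCH_MPOL_DEFAULT || mode == SKETCH_MPOL_LOCAL);

    if (syscall(SYS_set_mempolicy, (long)mode, NULL, 0UL) == -1)
        return -errno;
    return 0;
}
```

sketch_set_thread_policy(SKETCH_MPOL_LOCAL) asks for local allocation
explicitly, while sketch_set_thread_policy(SKETCH_MPOL_DEFAULT) removes
any thread policy. As noted above, neither call guarantees local pages
when the local node is short of memory.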

2016-10-09 18:57:31

by Piotr Kwapulinski

Subject: [PATCH v2 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

The MPOL_LOCAL mode has been implemented by
Peter Zijlstra <[email protected]>
(commit: 479e2802d09f1e18a97262c4c6f8f17ae5884bd8).
Add the documentation for this mode.

Signed-off-by: Piotr Kwapulinski <[email protected]>
---
This version adds more details about MPOL_LOCAL mode:
1. difference between MPOL_LOCAL and MPOL_DEFAULT
2. what if local node is overallocated or not allowed by the cpuset
---
man2/mbind.2 | 28 ++++++++++++++++++++++++----
man2/set_mempolicy.2 | 19 ++++++++++++++++++-
2 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/man2/mbind.2 b/man2/mbind.2
index 3ea24f6..1dbda1e 100644
--- a/man2/mbind.2
+++ b/man2/mbind.2
@@ -130,8 +130,9 @@ argument must specify one of
.BR MPOL_DEFAULT ,
.BR MPOL_BIND ,
.BR MPOL_INTERLEAVE ,
+.BR MPOL_PREFERRED ,
or
-.BR MPOL_PREFERRED .
+.BR MPOL_LOCAL .
All policy modes except
.B MPOL_DEFAULT
require the caller to specify via the
@@ -258,9 +259,26 @@ and
.I maxnode
arguments specify the empty set, then the memory is allocated on
the node of the CPU that triggered the allocation.
-This is the only way to specify "local allocation" for a
-range of memory via
-.BR mbind ().
+
+.B MPOL_LOCAL
+specifies the "local allocation", the memory is allocated on
+the node of the CPU that triggered the allocation, "local node".
+The
+.I nodemask
+and
+.I maxnode
+arguments must specify the empty set. If the "local node" is low
+on free memory the kernel will try to allocate memory from other
+nodes. The kernel will allocate memory from the "local node"
+whenever the memory for this node will be released. If the
+"local node" is not allowed by the process's current cpuset context
+the kernel will try to allocate memory from other nodes. The kernel
+will allocate memory from the "local node" whenever it becomes
+allowed by the process's current cpuset context. In contrast
+.B MPOL_DEFAULT
+reverts to the policy of the process which may have been set with
+.BR set_mempolicy (2).
+It may not be the "local allocation".

If
.B MPOL_MF_STRICT
@@ -440,6 +458,8 @@ To select explicit "local allocation" for a memory range,
specify a
.I mode
of
+.B MPOL_LOCAL
+or
.B MPOL_PREFERRED
with an empty set of nodes.
This method will work for
diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
index 1f02037..3592734 100644
--- a/man2/set_mempolicy.2
+++ b/man2/set_mempolicy.2
@@ -79,8 +79,9 @@ argument must specify one of
.BR MPOL_DEFAULT ,
.BR MPOL_BIND ,
.BR MPOL_INTERLEAVE ,
+.BR MPOL_PREFERRED ,
or
-.BR MPOL_PREFERRED .
+.BR MPOL_LOCAL .
All modes except
.B MPOL_DEFAULT
require the caller to specify via the
@@ -211,6 +212,22 @@ arguments specify the empty set, then the policy
specifies "local allocation"
(like the system default policy discussed above).

+.B MPOL_LOCAL
+specifies the "local allocation", the memory is allocated on
+the node of the CPU that triggered the allocation, "local node".
+The
+.I nodemask
+and
+.I maxnode
+arguments must specify the empty set. If the "local node" is low
+on free memory the kernel will try to allocate memory from other
+nodes. The kernel will allocate memory from the "local node"
+whenever the memory for this node will be released. If the
+"local node" is not allowed by the process's current cpuset context
+the kernel will try to allocate memory from other nodes. The kernel
+will allocate memory from the "local node" whenever it becomes
+allowed by the process's current cpuset context.
+
The thread memory policy is preserved across an
.BR execve (2),
and is inherited by child threads created using
--
2.10.0

Subject: Re: [PATCH v2 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

On Sun, 9 Oct 2016, Piotr Kwapulinski wrote:

> +arguments must specify the empty set. If the "local node" is low
> +on free memory the kernel will try to allocate memory from other
> +nodes. The kernel will allocate memory from the "local node"
> +whenever the memory for this node will be released. If the

"whenever memory for this node is available"?

Otherwise

Reviewed-by: Christoph Lameter <[email protected]>

2016-10-10 16:23:28

by Piotr Kwapulinski

Subject: [PATCH v3 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

The MPOL_LOCAL mode has been implemented by
Peter Zijlstra <[email protected]>
(commit: 479e2802d09f1e18a97262c4c6f8f17ae5884bd8).
Add the documentation for this mode.

Signed-off-by: Piotr Kwapulinski <[email protected]>
---
This version fixes grammar
---
man2/mbind.2 | 28 ++++++++++++++++++++++++----
man2/set_mempolicy.2 | 19 ++++++++++++++++++-
2 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/man2/mbind.2 b/man2/mbind.2
index 3ea24f6..854580c 100644
--- a/man2/mbind.2
+++ b/man2/mbind.2
@@ -130,8 +130,9 @@ argument must specify one of
.BR MPOL_DEFAULT ,
.BR MPOL_BIND ,
.BR MPOL_INTERLEAVE ,
+.BR MPOL_PREFERRED ,
or
-.BR MPOL_PREFERRED .
+.BR MPOL_LOCAL .
All policy modes except
.B MPOL_DEFAULT
require the caller to specify via the
@@ -258,9 +259,26 @@ and
.I maxnode
arguments specify the empty set, then the memory is allocated on
the node of the CPU that triggered the allocation.
-This is the only way to specify "local allocation" for a
-range of memory via
-.BR mbind ().
+
+.B MPOL_LOCAL
+specifies the "local allocation", the memory is allocated on
+the node of the CPU that triggered the allocation, "local node".
+The
+.I nodemask
+and
+.I maxnode
+arguments must specify the empty set. If the "local node" is low
+on free memory the kernel will try to allocate memory from other
+nodes. The kernel will allocate memory from the "local node"
+whenever memory for this node is available. If the "local node"
+is not allowed by the process's current cpuset context the kernel
+will try to allocate memory from other nodes. The kernel will
+allocate memory from the "local node" whenever it becomes allowed
+by the process's current cpuset context. In contrast
+.B MPOL_DEFAULT
+reverts to the policy of the process which may have been set with
+.BR set_mempolicy (2).
+It may not be the "local allocation".

If
.B MPOL_MF_STRICT
@@ -440,6 +458,8 @@ To select explicit "local allocation" for a memory range,
specify a
.I mode
of
+.B MPOL_LOCAL
+or
.B MPOL_PREFERRED
with an empty set of nodes.
This method will work for
diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
index 1f02037..22b0f7c 100644
--- a/man2/set_mempolicy.2
+++ b/man2/set_mempolicy.2
@@ -79,8 +79,9 @@ argument must specify one of
.BR MPOL_DEFAULT ,
.BR MPOL_BIND ,
.BR MPOL_INTERLEAVE ,
+.BR MPOL_PREFERRED ,
or
-.BR MPOL_PREFERRED .
+.BR MPOL_LOCAL .
All modes except
.B MPOL_DEFAULT
require the caller to specify via the
@@ -211,6 +212,22 @@ arguments specify the empty set, then the policy
specifies "local allocation"
(like the system default policy discussed above).

+.B MPOL_LOCAL
+specifies the "local allocation", the memory is allocated on
+the node of the CPU that triggered the allocation, "local node".
+The
+.I nodemask
+and
+.I maxnode
+arguments must specify the empty set. If the "local node" is low
+on free memory the kernel will try to allocate memory from other
+nodes. The kernel will allocate memory from the "local node"
+whenever memory for this node is available. If the "local node"
+is not allowed by the process's current cpuset context the kernel
+will try to allocate memory from other nodes. The kernel will
+allocate memory from the "local node" whenever it becomes allowed
+by the process's current cpuset context.
+
The thread memory policy is preserved across an
.BR execve (2),
and is inherited by child threads created using
--
2.10.0

Subject: Re: [PATCH v3 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

Hello Piotr,

On 10/10/2016 06:23 PM, Piotr Kwapulinski wrote:
> The MPOL_LOCAL mode has been implemented by
> Peter Zijlstra <[email protected]>
> (commit: 479e2802d09f1e18a97262c4c6f8f17ae5884bd8).
> Add the documentation for this mode.

Thanks. I've applied this patch. I have a question below.

> Signed-off-by: Piotr Kwapulinski <[email protected]>
> ---
> This version fixes grammar
> ---
> man2/mbind.2 | 28 ++++++++++++++++++++++++----
> man2/set_mempolicy.2 | 19 ++++++++++++++++++-
> 2 files changed, 42 insertions(+), 5 deletions(-)
>
> diff --git a/man2/mbind.2 b/man2/mbind.2
> index 3ea24f6..854580c 100644
> --- a/man2/mbind.2
> +++ b/man2/mbind.2
> @@ -130,8 +130,9 @@ argument must specify one of
> .BR MPOL_DEFAULT ,
> .BR MPOL_BIND ,
> .BR MPOL_INTERLEAVE ,
> +.BR MPOL_PREFERRED ,
> or
> -.BR MPOL_PREFERRED .
> +.BR MPOL_LOCAL .
> All policy modes except
> .B MPOL_DEFAULT
> require the caller to specify via the
> @@ -258,9 +259,26 @@ and
> .I maxnode
> arguments specify the empty set, then the memory is allocated on
> the node of the CPU that triggered the allocation.
> -This is the only way to specify "local allocation" for a
> -range of memory via
> -.BR mbind ().
> +
> +.B MPOL_LOCAL
> +specifies the "local allocation", the memory is allocated on
> +the node of the CPU that triggered the allocation, "local node".
> +The
> +.I nodemask
> +and
> +.I maxnode
> +arguments must specify the empty set. If the "local node" is low
> +on free memory the kernel will try to allocate memory from other
> +nodes. The kernel will allocate memory from the "local node"
> +whenever memory for this node is available. If the "local node"
> +is not allowed by the process's current cpuset context the kernel
> +will try to allocate memory from other nodes. The kernel will
> +allocate memory from the "local node" whenever it becomes allowed
> +by the process's current cpuset context. In contrast
> +.B MPOL_DEFAULT
> +reverts to the policy of the process which may have been set with
> +.BR set_mempolicy (2).
> +It may not be the "local allocation".

What is the sense of "may not be" here? (And repeated below).
Is the meaning "this could be something other than"?
Presumably the answer is yes, in which case I'll clarify
the wording there. Let me know.

Cheers,

Michael


>
> If
> .B MPOL_MF_STRICT
> @@ -440,6 +458,8 @@ To select explicit "local allocation" for a memory range,
> specify a
> .I mode
> of
> +.B MPOL_LOCAL
> +or
> .B MPOL_PREFERRED
> with an empty set of nodes.
> This method will work for
> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
> index 1f02037..22b0f7c 100644
> --- a/man2/set_mempolicy.2
> +++ b/man2/set_mempolicy.2
> @@ -79,8 +79,9 @@ argument must specify one of
> .BR MPOL_DEFAULT ,
> .BR MPOL_BIND ,
> .BR MPOL_INTERLEAVE ,
> +.BR MPOL_PREFERRED ,
> or
> -.BR MPOL_PREFERRED .
> +.BR MPOL_LOCAL .
> All modes except
> .B MPOL_DEFAULT
> require the caller to specify via the
> @@ -211,6 +212,22 @@ arguments specify the empty set, then the policy
> specifies "local allocation"
> (like the system default policy discussed above).
>
> +.B MPOL_LOCAL
> +specifies the "local allocation", the memory is allocated on
> +the node of the CPU that triggered the allocation, "local node".
> +The
> +.I nodemask
> +and
> +.I maxnode
> +arguments must specify the empty set. If the "local node" is low
> +on free memory the kernel will try to allocate memory from other
> +nodes. The kernel will allocate memory from the "local node"
> +whenever memory for this node is available. If the "local node"
> +is not allowed by the process's current cpuset context the kernel
> +will try to allocate memory from other nodes. The kernel will
> +allocate memory from the "local node" whenever it becomes allowed
> +by the process's current cpuset context.
> +
> The thread memory policy is preserved across an
> .BR execve (2),
> and is inherited by child threads created using
>


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Subject: Re: [PATCH v3 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

On Wed, 12 Oct 2016, Michael Kerrisk (man-pages) wrote:

> > +arguments must specify the empty set. If the "local node" is low
> > +on free memory the kernel will try to allocate memory from other
> > +nodes. The kernel will allocate memory from the "local node"
> > +whenever memory for this node is available. If the "local node"
> > +is not allowed by the process's current cpuset context the kernel
> > +will try to allocate memory from other nodes. The kernel will
> > +allocate memory from the "local node" whenever it becomes allowed
> > +by the process's current cpuset context. In contrast
> > +.B MPOL_DEFAULT
> > +reverts to the policy of the process which may have been set with
> > +.BR set_mempolicy (2).
> > +It may not be the "local allocation".
>
> What is the sense of "may not be" here? (And repeated below).
> Is the meaning "this could be something other than"?
> Presumably the answer is yes, in which case I'll clarify
> the wording there. Let me know.

Someone may have set for example a round robin policy with numactl
--interleave before starting the process? Then allocations will go through
all nodes.
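
To make that scenario concrete, here is a hedged sketch (not part of
the patch) of applying a mode to one address range with the raw mbind
system call. If the process was started under numactl --interleave,
passing MPOL_DEFAULT leaves the range following that interleave policy,
whereas MPOL_LOCAL requests local allocation for the range regardless.
The sketch_ names are hypothetical and the mode values are assumed to
match the kernel's <linux/mempolicy.h>.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed to mirror the values in the kernel's <linux/mempolicy.h>. */
enum {
    SKETCH_MPOL_DEFAULT = 0,
    SKETCH_MPOL_LOCAL   = 4,
};

/* Map len bytes of private anonymous memory; returns NULL on failure. */
static void *sketch_map_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Apply a policy mode to [addr, addr + len) with an empty node set
 * and no MPOL_MF_* flags. addr must be page aligned.
 *
 * Returns 0 on success or a negative errno value on failure.
 */
static int sketch_mbind_range(void *addr, size_t len, int mode)
{
    assert(addr != NULL && len > 0);

    if (syscall(SYS_mbind, addr, (unsigned long)len, (long)mode,
                NULL, 0UL, 0UL) == -1)
        return -errno;
    return 0;
}
```

With SKETCH_MPOL_DEFAULT the range carries no policy of its own, so
what happens on a page fault depends on whatever thread policy is in
effect at that time.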

Subject: Re: [PATCH v3 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

Hi Christoph,

On 12 October 2016 at 16:08, Christoph Lameter <[email protected]> wrote:
> On Wed, 12 Oct 2016, Michael Kerrisk (man-pages) wrote:
>
>> > +arguments must specify the empty set. If the "local node" is low
>> > +on free memory the kernel will try to allocate memory from other
>> > +nodes. The kernel will allocate memory from the "local node"
>> > +whenever memory for this node is available. If the "local node"
>> > +is not allowed by the process's current cpuset context the kernel
>> > +will try to allocate memory from other nodes. The kernel will
>> > +allocate memory from the "local node" whenever it becomes allowed
>> > +by the process's current cpuset context. In contrast
>> > +.B MPOL_DEFAULT
>> > +reverts to the policy of the process which may have been set with
>> > +.BR set_mempolicy (2).
>> > +It may not be the "local allocation".
>>
>> What is the sense of "may not be" here? (And repeated below).
>> Is the meaning "this could be something other than"?
>> Presumably the answer is yes, in which case I'll clarify
>> the wording there. Let me know.
>
> Someone may have set for example a round robin policy with numactl
> --interleave before starting the process? Then allocations will go through
> all nodes.

So the sense is then "this could be something other than", right?

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

2016-10-12 15:53:25

by Piotr Kwapulinski

Subject: Re: [PATCH v3 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

Hi Michael,

On Wed, Oct 12, 2016 at 09:55:16AM +0200, Michael Kerrisk (man-pages) wrote:
> Hello Piotr,
>
> On 10/10/2016 06:23 PM, Piotr Kwapulinski wrote:
> > The MPOL_LOCAL mode has been implemented by
> > Peter Zijlstra <[email protected]>
> > (commit: 479e2802d09f1e18a97262c4c6f8f17ae5884bd8).
> > Add the documentation for this mode.
>
> Thanks. I've applied this patch. I have a question below.
>
> > Signed-off-by: Piotr Kwapulinski <[email protected]>
> > ---
> > This version fixes grammar
> > ---
> > man2/mbind.2 | 28 ++++++++++++++++++++++++----
> > man2/set_mempolicy.2 | 19 ++++++++++++++++++-
> > 2 files changed, 42 insertions(+), 5 deletions(-)
> >
> > diff --git a/man2/mbind.2 b/man2/mbind.2
> > index 3ea24f6..854580c 100644
> > --- a/man2/mbind.2
> > +++ b/man2/mbind.2
> > @@ -130,8 +130,9 @@ argument must specify one of
> > .BR MPOL_DEFAULT ,
> > .BR MPOL_BIND ,
> > .BR MPOL_INTERLEAVE ,
> > +.BR MPOL_PREFERRED ,
> > or
> > -.BR MPOL_PREFERRED .
> > +.BR MPOL_LOCAL .
> > All policy modes except
> > .B MPOL_DEFAULT
> > require the caller to specify via the
> > @@ -258,9 +259,26 @@ and
> > .I maxnode
> > arguments specify the empty set, then the memory is allocated on
> > the node of the CPU that triggered the allocation.
> > -This is the only way to specify "local allocation" for a
> > -range of memory via
> > -.BR mbind ().
> > +
> > +.B MPOL_LOCAL
> > +specifies the "local allocation", the memory is allocated on
> > +the node of the CPU that triggered the allocation, "local node".
> > +The
> > +.I nodemask
> > +and
> > +.I maxnode
> > +arguments must specify the empty set. If the "local node" is low
> > +on free memory the kernel will try to allocate memory from other
> > +nodes. The kernel will allocate memory from the "local node"
> > +whenever memory for this node is available. If the "local node"
> > +is not allowed by the process's current cpuset context the kernel
> > +will try to allocate memory from other nodes. The kernel will
> > +allocate memory from the "local node" whenever it becomes allowed
> > +by the process's current cpuset context. In contrast
> > +.B MPOL_DEFAULT
> > +reverts to the policy of the process which may have been set with
> > +.BR set_mempolicy (2).
> > +It may not be the "local allocation".
>
> What is the sense of "may not be" here? (And repeated below).
> Is the meaning "this could be something other than"?
> Presumably the answer is yes, in which case I'll clarify
> the wording there. Let me know.
>
> Cheers,
>
> Michael
>

That's right. This could be "local allocation" or any other memory policy.

Thanks
Piotr Kwapulinski

> >
> > If
> > .B MPOL_MF_STRICT
> > @@ -440,6 +458,8 @@ To select explicit "local allocation" for a memory range,
> > specify a
> > .I mode
> > of
> > +.B MPOL_LOCAL
> > +or
> > .B MPOL_PREFERRED
> > with an empty set of nodes.
> > This method will work for
> > diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
> > index 1f02037..22b0f7c 100644
> > --- a/man2/set_mempolicy.2
> > +++ b/man2/set_mempolicy.2
> > @@ -79,8 +79,9 @@ argument must specify one of
> > .BR MPOL_DEFAULT ,
> > .BR MPOL_BIND ,
> > .BR MPOL_INTERLEAVE ,
> > +.BR MPOL_PREFERRED ,
> > or
> > -.BR MPOL_PREFERRED .
> > +.BR MPOL_LOCAL .
> > All modes except
> > .B MPOL_DEFAULT
> > require the caller to specify via the
> > @@ -211,6 +212,22 @@ arguments specify the empty set, then the policy
> > specifies "local allocation"
> > (like the system default policy discussed above).
> >
> > +.B MPOL_LOCAL
> > +specifies the "local allocation", the memory is allocated on
> > +the node of the CPU that triggered the allocation, "local node".
> > +The
> > +.I nodemask
> > +and
> > +.I maxnode
> > +arguments must specify the empty set. If the "local node" is low
> > +on free memory the kernel will try to allocate memory from other
> > +nodes. The kernel will allocate memory from the "local node"
> > +whenever memory for this node is available. If the "local node"
> > +is not allowed by the process's current cpuset context the kernel
> > +will try to allocate memory from other nodes. The kernel will
> > +allocate memory from the "local node" whenever it becomes allowed
> > +by the process's current cpuset context.
> > +
> > The thread memory policy is preserved across an
> > .BR execve (2),
> > and is inherited by child threads created using
> >
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
--
Piotr Kwapulinski

Subject: Re: [PATCH v3 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

On Wed, 12 Oct 2016, Piotr Kwapulinski wrote:

> That's right. This could be "local allocation" or any other memory policy.

Correct.
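
Since the policy in effect "could be local allocation or any other
memory policy", a program that cares can ask the kernel. The following
is a small sketch under the same assumptions as before: the sketch_
name is hypothetical, and the raw get_mempolicy system call is used so
that libnuma is not required.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Query the calling thread's memory policy mode.
 *
 * With flags == 0 and no nodemask buffer, the kernel stores the
 * thread policy mode in *mode_out. The stored value may also carry
 * mode flags in its upper bits, so compare it with care.
 *
 * Returns 0 on success or a negative errno value on failure.
 */
static int sketch_get_thread_policy(int *mode_out)
{
    assert(mode_out != NULL);

    if (syscall(SYS_get_mempolicy, mode_out, NULL, 0UL, NULL, 0UL) == -1)
        return -errno;
    return 0;
}
```

A thread that has never had a policy set is expected to report
MPOL_DEFAULT (0); one started under numactl --interleave would report
an interleave mode instead.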

Subject: Re: [PATCH v3 0/1] man/set_mempolicy.2,mbind.2: add MPOL_LOCAL NUMA memory policy documentation

On 10/12/2016 09:55 PM, Christoph Lameter wrote:
> On Wed, 12 Oct 2016, Piotr Kwapulinski wrote:
>
>> That's right. This could be "local allocation" or any other memory policy.
>
> Correct.
>

Thanks, Piotr and Christoph.

Cheers,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/