2017-09-15 21:38:11

by Mike Kravetz

Subject: [patch] mremap.2: Add description of old_size == 0 functionality

Since at least the 2.6 time frame, mremap would create a new mapping
of the same pages if 'old_size == 0'. It would also leave the original
mapping. This was used to create a 'duplicate mapping'.

Document the behavior and return codes. But, also mention that the
functionality is deprecated and discourage its use.

A recent change was made to mremap so that an attempt to create a
duplicate of a private mapping will fail.

commit dba58d3b8c5045ad89c1c95d33d01451e3964db7
Author: Mike Kravetz <[email protected]>
Date: Wed Sep 6 16:20:55 2017 -0700

mm/mremap: fail map duplication attempts for private mappings

This return code is also documented here.

Signed-off-by: Mike Kravetz <[email protected]>
---
man2/mremap.2 | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/man2/mremap.2 b/man2/mremap.2
index 98643c640..98df7d5fa 100644
--- a/man2/mremap.2
+++ b/man2/mremap.2
@@ -58,6 +58,21 @@ may be provided; see the description of
.B MREMAP_FIXED
below.
.PP
+If the value of \fIold_size\fP is zero, and \fIold_address\fP refers to
+a private anonymous mapping, then
+.BR mremap ()
+will create a new mapping of the same pages. \fInew_size\fP
+will be the size of the new mapping and the location of the new mapping
+may be specified with \fInew_address\fP, see the description of
+.B MREMAP_FIXED
+below. If a new mapping is requested via this method, then the
+.B MREMAP_MAYMOVE
+flag must also be specified. This functionality is deprecated, and no
+new code should be written to use this feature. A better method of
+obtaining multiple mappings of the same private anonymous memory is via the
+.BR memfd_create()
+system call.
+.PP
In Linux the memory is divided into pages.
A user process has (one or)
several linear virtual memory segments.
@@ -174,7 +189,12 @@ and
or
.B MREMAP_FIXED
was specified without also specifying
-.BR MREMAP_MAYMOVE .
+.BR MREMAP_MAYMOVE ;
+or \fIold_size\fP was zero and \fIold_address\fP does not refer to a
+private anonymous mapping;
+or \fIold_size\fP was zero and the
+.BR MREMAP_MAYMOVE
+flag was not specified.
.TP
.B ENOMEM
The memory area cannot be expanded at the current virtual address, and the
@@ -210,6 +230,7 @@ if the area cannot be populated.
.BR brk (2),
.BR getpagesize (2),
.BR getrlimit (2),
+.BR memfd_create(2),
.BR mlock (2),
.BR mmap (2),
.BR sbrk (2),
--
2.13.5


2017-09-15 21:55:08

by Mike Kravetz

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

CC: linux-mm

On 09/15/2017 02:37 PM, Mike Kravetz wrote:
> Since at least the 2.6 time frame, mremap would create a new mapping
> of the same pages if 'old_size == 0'. It would also leave the original
> mapping. This was used to create a 'duplicate mapping'.
>
> Document the behavior and return codes. But, also mention that the
> functionality is deprecated and discourage its use.
>
> A recent change was made to mremap so that an attempt to create a
> duplicate of a private mapping will fail.
>
> commit dba58d3b8c5045ad89c1c95d33d01451e3964db7
> Author: Mike Kravetz <[email protected]>
> Date: Wed Sep 6 16:20:55 2017 -0700
>
> mm/mremap: fail map duplication attempts for private mappings
>
> This return code is also documented here.
>
> Signed-off-by: Mike Kravetz <[email protected]>
> ---
> man2/mremap.2 | 23 ++++++++++++++++++++++-
> 1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/man2/mremap.2 b/man2/mremap.2
> index 98643c640..98df7d5fa 100644
> --- a/man2/mremap.2
> +++ b/man2/mremap.2
> @@ -58,6 +58,21 @@ may be provided; see the description of
> .B MREMAP_FIXED
> below.
> .PP
> +If the value of \fIold_size\fP is zero, and \fIold_address\fP refers to
> +a private anonymous mapping, then
> +.BR mremap ()
> +will create a new mapping of the same pages. \fInew_size\fP
> +will be the size of the new mapping and the location of the new mapping
> +may be specified with \fInew_address\fP, see the description of
> +.B MREMAP_FIXED
> +below. If a new mapping is requested via this method, then the
> +.B MREMAP_MAYMOVE
> +flag must also be specified. This functionality is deprecated, and no
> +new code should be written to use this feature. A better method of
> +obtaining multiple mappings of the same private anonymous memory is via the
> +.BR memfd_create()
> +system call.
> +.PP
> In Linux the memory is divided into pages.
> A user process has (one or)
> several linear virtual memory segments.
> @@ -174,7 +189,12 @@ and
> or
> .B MREMAP_FIXED
> was specified without also specifying
> -.BR MREMAP_MAYMOVE .
> +.BR MREMAP_MAYMOVE ;
> +or \fIold_size\fP was zero and \fIold_address\fP does not refer to a
> +private anonymous mapping;
> +or \fIold_size\fP was zero and the
> +.BR MREMAP_MAYMOVE
> +flag was not specified.
> .TP
> .B ENOMEM
> The memory area cannot be expanded at the current virtual address, and the
> @@ -210,6 +230,7 @@ if the area cannot be populated.
> .BR brk (2),
> .BR getpagesize (2),
> .BR getrlimit (2),
> +.BR memfd_create(2),
> .BR mlock (2),
> .BR mmap (2),
> .BR sbrk (2),
>

2017-09-18 01:53:09

by Jann Horn

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On Fri, Sep 15, 2017 at 2:37 PM, Mike Kravetz <[email protected]> wrote:
[...]
> A recent change was made to mremap so that an attempt to create a
> duplicate of a private mapping will fail.
>
> commit dba58d3b8c5045ad89c1c95d33d01451e3964db7
> Author: Mike Kravetz <[email protected]>
> Date: Wed Sep 6 16:20:55 2017 -0700
>
> mm/mremap: fail map duplication attempts for private mappings
>
> This return code is also documented here.
[...]
> diff --git a/man2/mremap.2 b/man2/mremap.2
[...]
> @@ -174,7 +189,12 @@ and
> or
> .B MREMAP_FIXED
> was specified without also specifying
> -.BR MREMAP_MAYMOVE .
> +.BR MREMAP_MAYMOVE ;
> +or \fIold_size\fP was zero and \fIold_address\fP does not refer to a
> +private anonymous mapping;

Shouldn't this be the other way around? "or old_size was zero and
old_address refers to a private anonymous mapping"?

2017-09-18 13:45:43

by Florian Weimer

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On 09/15/2017 11:53 PM, Mike Kravetz wrote:
> +If the value of \fIold_size\fP is zero, and \fIold_address\fP refers to
> +a private anonymous mapping, then
> +.BR mremap ()
> +will create a new mapping of the same pages. \fInew_size\fP
> +will be the size of the new mapping and the location of the new mapping
> +may be specified with \fInew_address\fP, see the description of
> +.B MREMAP_FIXED
> +below. If a new mapping is requested via this method, then the
> +.B MREMAP_MAYMOVE
> +flag must also be specified. This functionality is deprecated, and no
> +new code should be written to use this feature. A better method of
> +obtaining multiple mappings of the same private anonymous memory is via the
> +.BR memfd_create()
> +system call.

Is there any particular reason to deprecate this?

In glibc, we cannot use memfd_create and keep the file descriptor around
because the application can close descriptors beneath us.

(We might want to use alias mappings to avoid run-time code generation
for PLT-less LD_AUDIT interceptors.)

Thanks,
Florian

2017-09-18 17:12:50

by Mike Kravetz

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On 09/18/2017 06:45 AM, Florian Weimer wrote:
> On 09/15/2017 11:53 PM, Mike Kravetz wrote:
>> +If the value of \fIold_size\fP is zero, and \fIold_address\fP refers to
>> +a private anonymous mapping, then
>> +.BR mremap ()
>> +will create a new mapping of the same pages. \fInew_size\fP
>> +will be the size of the new mapping and the location of the new mapping
>> +may be specified with \fInew_address\fP, see the description of
>> +.B MREMAP_FIXED
>> +below. If a new mapping is requested via this method, then the
>> +.B MREMAP_MAYMOVE
>> +flag must also be specified. This functionality is deprecated, and no
>> +new code should be written to use this feature. A better method of
>> +obtaining multiple mappings of the same private anonymous memory is via the
>> +.BR memfd_create()
>> +system call.
>
> Is there any particular reason to deprecate this?
>
> In glibc, we cannot use memfd_create and keep the file descriptor around because the application can close descriptors beneath us.
>
> (We might want to use alias mappings to avoid run-time code generation for PLT-less LD_AUDIT interceptors.)
>

Hi Florian,

When I brought up this mremap 'duplicate mapping' functionality on the mm
mail list, most developers were surprised. It seems this functionality exists
mostly 'by chance', and it was not really designed. It certainly was never
documented. There were suggestions to remove the functionality, which led
to my claim that it was being deprecated. However, in hindsight that may
have been too strong.

I can drop this wording, but would still like to suggest memfd_create as
the preferred method of creating duplicate mappings. It would be good if
others on Cc: could comment as well.
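
For reference, the memfd_create approach mentioned above might look
roughly like the sketch below. It is only an illustration under stated
assumptions: the helper name alias_via_memfd and the "alias-demo" label
are made up for this example, and memfd_create needs a reasonably recent
kernel and C library.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map the same memfd twice and check that both
   views alias the same pages.  Returns 0 on success, -1 on failure.  */
int
alias_via_memfd (void)
{
  long pagesize = sysconf (_SC_PAGESIZE);
  if (pagesize <= 0)
    return -1;

  /* Anonymous, file-descriptor-backed memory.  */
  int fd = memfd_create ("alias-demo", MFD_CLOEXEC);
  if (fd < 0)
    return -1;
  if (ftruncate (fd, pagesize) != 0)
    {
      close (fd);
      return -1;
    }

  char *a = mmap (NULL, pagesize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  char *b = mmap (NULL, pagesize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  /* Existing mappings keep the memory alive after the descriptor is
     closed, but no further aliases can be created without it.  */
  close (fd);
  if (a == MAP_FAILED || b == MAP_FAILED)
    {
      if (a != MAP_FAILED)
        munmap (a, pagesize);
      if (b != MAP_FAILED)
        munmap (b, pagesize);
      return -1;
    }
  assert (a != b);              /* two distinct virtual addresses */

  /* A write through one view must be visible through the other.  */
  strcpy (a, "hello");
  int ok = strcmp (b, "hello") == 0;

  munmap (a, pagesize);
  munmap (b, pagesize);
  return ok ? 0 : -1;
}
```

Note that every additional alias needs the descriptor to still be open,
which is exactly the constraint Florian raises for glibc.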

Just curious, does glibc make use of this today? Or, is this just something
that you think may be useful.

--
Mike Kravetz

2017-09-18 17:22:12

by Mike Kravetz

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On 09/17/2017 06:52 PM, Jann Horn wrote:
> On Fri, Sep 15, 2017 at 2:37 PM, Mike Kravetz <[email protected]> wrote:
> [...]
>> A recent change was made to mremap so that an attempt to create a
>> duplicate of a private mapping will fail.
>>
>> commit dba58d3b8c5045ad89c1c95d33d01451e3964db7
>> Author: Mike Kravetz <[email protected]>
>> Date: Wed Sep 6 16:20:55 2017 -0700
>>
>> mm/mremap: fail map duplication attempts for private mappings
>>
>> This return code is also documented here.
> [...]
>> diff --git a/man2/mremap.2 b/man2/mremap.2
> [...]
>> @@ -174,7 +189,12 @@ and
>> or
>> .B MREMAP_FIXED
>> was specified without also specifying
>> -.BR MREMAP_MAYMOVE .
>> +.BR MREMAP_MAYMOVE ;
>> +or \fIold_size\fP was zero and \fIold_address\fP does not refer to a
>> +private anonymous mapping;
>
> Shouldn't this be the other way around? "or old_size was zero and
> old_address refers to a private anonymous mapping"?

Thanks Jann,

Yes that is wrong. In addition, the description of this functionality
in the section before this is also incorrect.

I will fix both in a new version of the patch.

--
Mike Kravetz

2017-09-19 12:11:25

by Florian Weimer

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On 09/18/2017 07:11 PM, Mike Kravetz wrote:
> On 09/18/2017 06:45 AM, Florian Weimer wrote:
>> On 09/15/2017 11:53 PM, Mike Kravetz wrote:
>>> +If the value of \fIold_size\fP is zero, and \fIold_address\fP refers to
>>> +a private anonymous mapping, then
>>> +.BR mremap ()
>>> +will create a new mapping of the same pages. \fInew_size\fP
>>> +will be the size of the new mapping and the location of the new mapping
>>> +may be specified with \fInew_address\fP, see the description of
>>> +.B MREMAP_FIXED
>>> +below. If a new mapping is requested via this method, then the
>>> +.B MREMAP_MAYMOVE
>>> +flag must also be specified. This functionality is deprecated, and no
>>> +new code should be written to use this feature. A better method of
>>> +obtaining multiple mappings of the same private anonymous memory is via the
>>> +.BR memfd_create()
>>> +system call.
>>
>> Is there any particular reason to deprecate this?
>>
>> In glibc, we cannot use memfd_create and keep the file descriptor around because the application can close descriptors beneath us.
>>
>> (We might want to use alias mappings to avoid run-time code generation for PLT-less LD_AUDIT interceptors.)
>>
>
> Hi Florian,
>
> When I brought up this mremap 'duplicate mapping' functionality on the mm
> mail list, most developers were surprised. It seems this functionality exists
> mostly 'by chance', and it was not really designed. It certainly was never
> documented. There were suggestions to remove the functionality, which led
> to my claim that it was being deprecated. However, in hindsight that may
> have been too strong.

This history is certainly a bit odd.

> I can drop this wording, but would still like to suggest memfd_create as
> the preferred method of creating duplicate mappings. It would be good if
> others on Cc: could comment as well.

mremap seems to work with non-anonymous mappings, too:

#define _GNU_SOURCE   /* for mremap and MREMAP_MAYMOVE */
#include <err.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hopefully large enough to prevent crossing of a page boundary in
   the implementation.  */
__attribute__ ((aligned (256), noclone, noinline, weak))
int
callback (void)
{
  return 17;
}

int
main (void)
{
  long pagesize = sysconf (_SC_PAGESIZE);
  if (pagesize < 0)
    err (1, "sysconf");
  /* Round the function address down to the start of its page.  */
  uintptr_t addr = (uintptr_t) &callback;
  addr = addr / pagesize * pagesize;
  printf ("old function address: %p\n", (void *) &callback);
  ptrdiff_t page_offset = (uintptr_t) &callback - addr;
  /* old_size == 0 asks for a duplicate mapping of the page.  */
  void *newaddr = mremap ((void *) addr, 0, pagesize, MREMAP_MAYMOVE);
  if (newaddr == MAP_FAILED)
    err (1, "mremap");
  if (memcmp ((void *) addr, newaddr, pagesize) != 0)
    errx (1, "page contents differ");
  /* Same offset within the duplicated page.  */
  int (*newfunc) (void) = (int (*) (void)) ((char *) newaddr + page_offset);
  printf ("new function address: %p\n", (void *) newfunc);
  if (newfunc () != 17)
    errx (1, "invalid return value from newfunc");
  if (callback () != 17)
    errx (1, "invalid return value from callback");
  return 0;
}

(The code needs adjustment for architectures where function pointers
point to a descriptor and not the actual code.)

This looks very useful for generating arbitrary callback wrappers
without actual run-time code generation. memfd_create would not work
for that.

> Just curious, does glibc make use of this today? Or, is this just something
> that you think may be useful.

To my knowledge, we do not use this today. But it certainly looks very
useful.

Thanks,
Florian

2017-09-19 21:44:01

by Mike Kravetz

Subject: [patch v2] mremap.2: Add description of old_size == 0 functionality

v2: Fix incorrect wording noticed by Jann Horn.
Remove deprecated and memfd_create discussion as suggested
by Florian Weimer.

Since at least the 2.6 time frame, mremap would create a new mapping
of the same pages if 'old_size == 0'. It would also leave the original
mapping. This was used to create a 'duplicate mapping'.

A recent change was made to mremap so that an attempt to create a
duplicate of a private mapping will fail.

Document the 'old_size == 0' behavior and new return code from
below commit.

commit dba58d3b8c5045ad89c1c95d33d01451e3964db7
Author: Mike Kravetz <[email protected]>
Date: Wed Sep 6 16:20:55 2017 -0700

mm/mremap: fail map duplication attempts for private mappings

Signed-off-by: Mike Kravetz <[email protected]>
---
man2/mremap.2 | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/man2/mremap.2 b/man2/mremap.2
index 98643c640..235984a96 100644
--- a/man2/mremap.2
+++ b/man2/mremap.2
@@ -58,6 +58,20 @@ may be provided; see the description of
.B MREMAP_FIXED
below.
.PP
+If the value of \fIold_size\fP is zero, and \fIold_address\fP refers to
+a shareable mapping (see
+.BR mmap (2)
+.BR MAP_SHARED )
+, then
+.BR mremap ()
+will create a new mapping of the same pages. \fInew_size\fP
+will be the size of the new mapping and the location of the new mapping
+may be specified with \fInew_address\fP, see the description of
+.B MREMAP_FIXED
+below. If a new mapping is requested via this method, then the
+.B MREMAP_MAYMOVE
+flag must also be specified.
+.PP
In Linux the memory is divided into pages.
A user process has (one or)
several linear virtual memory segments.
@@ -174,7 +188,12 @@ and
or
.B MREMAP_FIXED
was specified without also specifying
-.BR MREMAP_MAYMOVE .
+.BR MREMAP_MAYMOVE ;
+or \fIold_size\fP was zero and \fIold_address\fP does not refer to a
+shareable mapping;
+or \fIold_size\fP was zero and the
+.BR MREMAP_MAYMOVE
+flag was not specified.
.TP
.B ENOMEM
The memory area cannot be expanded at the current virtual address, and the
--
2.13.5

Subject: Re: [patch v2] mremap.2: Add description of old_size == 0 functionality

Hello Mike,

On 09/19/2017 11:42 PM, Mike Kravetz wrote:
> v2: Fix incorrect wording noticed by Jann Horn.
> Remove deprecated and memfd_create discussion as suggested
> by Florian Weimer.
>
> Since at least the 2.6 time frame, mremap would create a new mapping
> of the same pages if 'old_size == 0'. It would also leave the original
> mapping. This was used to create a 'duplicate mapping'.
>
> A recent change was made to mremap so that an attempt to create a
> duplicate of a private mapping will fail.
>
> Document the 'old_size == 0' behavior and new return code from
> below commit.
>
> commit dba58d3b8c5045ad89c1c95d33d01451e3964db7
> Author: Mike Kravetz <[email protected]>
> Date: Wed Sep 6 16:20:55 2017 -0700
>
> mm/mremap: fail map duplication attempts for private mappings
>
> Signed-off-by: Mike Kravetz <[email protected]>
> ---
> man2/mremap.2 | 21 ++++++++++++++++++++-
> 1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/man2/mremap.2 b/man2/mremap.2
> index 98643c640..235984a96 100644
> --- a/man2/mremap.2
> +++ b/man2/mremap.2
> @@ -58,6 +58,20 @@ may be provided; see the description of
> .B MREMAP_FIXED
> below.
> .PP
> +If the value of \fIold_size\fP is zero, and \fIold_address\fP refers to
> +a shareable mapping (see
> +.BR mmap (2)
> +.BR MAP_SHARED )
> +, then
> +.BR mremap ()
> +will create a new mapping of the same pages. \fInew_size\fP
> +will be the size of the new mapping and the location of the new mapping
> +may be specified with \fInew_address\fP, see the description of
> +.B MREMAP_FIXED
> +below. If a new mapping is requested via this method, then the
> +.B MREMAP_MAYMOVE
> +flag must also be specified.
> +.PP
> In Linux the memory is divided into pages.
> A user process has (one or)
> several linear virtual memory segments.
> @@ -174,7 +188,12 @@ and
> or
> .B MREMAP_FIXED
> was specified without also specifying
> -.BR MREMAP_MAYMOVE .
> +.BR MREMAP_MAYMOVE ;
> +or \fIold_size\fP was zero and \fIold_address\fP does not refer to a
> +shareable mapping;
> +or \fIold_size\fP was zero and the
> +.BR MREMAP_MAYMOVE
> +flag was not specified.
> .TP
> .B ENOMEM
> The memory area cannot be expanded at the current virtual address, and the

I've applied this, and added Reviewed-by tags for Florian and Jann.
But, I think it's also worth noting the older, now disallowed, behavior,
and why the behavior was changed. So I added a note in BUGS:

BUGS
Before Linux 4.14, if old_size was zero and the mapping referred
to by old_address was a private mapping (mmap(2) MAP_PRIVATE),
mremap() created a new private mapping unrelated to the original
mapping. This behavior was unintended and probably unexpected in
user-space applications (since the intention of mremap() is to
create a new mapping based on the original mapping). Since Linux
4.14, mremap() fails with the error EINVAL in this scenario.

Does that seem okay?
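
The semantics described in the patch and in the BUGS text can be probed
with a small sketch along these lines. The helper name try_duplicate is
hypothetical, and the expected results assume Linux 4.14 or later, as
stated in the BUGS note.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: try to duplicate an anonymous mapping created
   with the given mmap flag (MAP_SHARED or MAP_PRIVATE) by calling
   mremap with old_size == 0.  Returns 1 if the new mapping aliases the
   original pages, 0 if it does not, or a negative errno value if mmap
   or mremap fails.  */
int
try_duplicate (int map_type)
{
  long pagesize = sysconf (_SC_PAGESIZE);
  if (pagesize <= 0)
    return -ENOTSUP;

  char *orig = mmap (NULL, pagesize, PROT_READ | PROT_WRITE,
                     map_type | MAP_ANONYMOUS, -1, 0);
  if (orig == MAP_FAILED)
    return -errno;
  strcpy (orig, "before");

  /* old_size == 0 requests a duplicate; MREMAP_MAYMOVE is mandatory.  */
  char *copy = mremap (orig, 0, pagesize, MREMAP_MAYMOVE);
  if (copy == MAP_FAILED)
    {
      int saved = errno;
      munmap (orig, pagesize);
      return -saved;
    }
  assert (copy != orig);        /* the original mapping stays in place */

  /* A write through the original is visible through a true alias.  */
  strcpy (orig, "after");
  int aliased = strcmp (copy, "after") == 0;

  munmap (copy, pagesize);
  munmap (orig, pagesize);
  return aliased;
}
```

On Linux 4.14 and later, try_duplicate (MAP_SHARED) should return 1 and
try_duplicate (MAP_PRIVATE) should return -EINVAL. On older kernels the
private case would instead be expected to return 0, matching the
"unrelated mapping" behavior described in BUGS.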

Cheers,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

2017-09-25 12:35:13

by Michal Hocko

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On Tue 19-09-17 14:11:19, Florian Weimer wrote:
> On 09/18/2017 07:11 PM, Mike Kravetz wrote:
[...]
> > I can drop this wording, but would still like to suggest memfd_create as
> > the preferred method of creating duplicate mappings. It would be good if
> > others on Cc: could comment as well.
>
> mremap seems to work with non-anonymous mappings, too:

only for shared mappings in fact. Because once we have CoW then mremap
will not provide you with the same content as the original mapping.
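
The copy-on-write effect can be illustrated without mremap at all. The
sketch below is only an analogy under stated assumptions: it maps the
same memfd twice with MAP_PRIVATE (the helper name private_views_diverge
is made up here) and shows that a write through one private view is not
seen by the other.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if two MAP_PRIVATE views of the same
   file start out identical and diverge after a write through one of
   them, 0 if they do not, -1 on setup failure.  */
int
private_views_diverge (void)
{
  long pagesize = sysconf (_SC_PAGESIZE);
  if (pagesize <= 0)
    return -1;

  int fd = memfd_create ("cow-demo", MFD_CLOEXEC);
  if (fd < 0)
    return -1;
  if (ftruncate (fd, pagesize) != 0)    /* zero-filled page */
    {
      close (fd);
      return -1;
    }

  char *a = mmap (NULL, pagesize, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
  char *b = mmap (NULL, pagesize, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
  close (fd);
  if (a == MAP_FAILED || b == MAP_FAILED)
    {
      if (a != MAP_FAILED)
        munmap (a, pagesize);
      if (b != MAP_FAILED)
        munmap (b, pagesize);
      return -1;
    }
  assert (a != b);

  int same_before = a[0] == b[0];
  /* The write gives view a its own private copy of the page.  */
  a[0] = 'x';
  int diverged = same_before && b[0] != 'x';

  munmap (a, pagesize);
  munmap (b, pagesize);
  return diverged;
}
```

Once a private page has been written, the writer holds its own copy, so
a second mapping can no longer be relied upon to show the same content.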

[...]

> > Just curious, does glibc make use of this today? Or, is this just something
> > that you think may be useful.
>
> To my knowledge, we do not use this today. But it certainly looks very
> useful.

What would be the usecase. I mean why don't you simply create a new
mapping by a plain mmap when you have no guarantee about the same
content?
--
Michal Hocko
SUSE Labs

2017-09-25 12:36:25

by Michal Hocko

Subject: Re: [patch v2] mremap.2: Add description of old_size == 0 functionality

On Wed 20-09-17 09:25:42, Michael Kerrisk wrote:
[...]
> BUGS
> Before Linux 4.14, if old_size was zero and the mapping referred
> to by old_address was a private mapping (mmap(2) MAP_PRIVATE),
> mremap() created a new private mapping unrelated to the original
> mapping. This behavior was unintended and probably unexpected in
> user-space applications (since the intention of mremap() is to
> create a new mapping based on the original mapping). Since Linux
> 4.14, mremap() fails with the error EINVAL in this scenario.
>
> Does that seem okay?

sorry to be late but yes this wording makes perfect sense to me.
--
Michal Hocko
SUSE Labs

2017-09-25 12:40:49

by Florian Weimer

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On 09/25/2017 02:35 PM, Michal Hocko wrote:
> What would be the usecase. I mean why don't you simply create a new
> mapping by a plain mmap when you have no guarantee about the same
> content?

I plan to use it for creating an unbounded number of callback thunks at
run time, from a single set of pages in libc.so, in case we need this
functionality.

The idea is to duplicate existing position-independent machine code in
libc.so, prefixed by a data mapping which controls its behavior. Each
data/code combination would only give us a fixed number of thunks, so
we'd need to create a new mapping to increase the total number.

Instead, we could re-map the code from the executable on disk, but not
if chroot has been called or glibc has been updated on disk. Creating
an alias mapping does not have these problems.

Another application (but that's for anonymous memory) would be to
duplicate class metadata in a Java-style VM, so that you can use bits in
the class pointer in each Java object (which is similar to the vtable
pointer in C++) for the garbage collector, without having to mask it
when accessing the class metadata in regular (mutator) code.

Thanks,
Florian

Subject: Re: [patch v2] mremap.2: Add description of old_size == 0 functionality

On 25 September 2017 at 14:36, Michal Hocko <[email protected]> wrote:
> On Wed 20-09-17 09:25:42, Michael Kerrisk wrote:
> [...]
>> BUGS
>> Before Linux 4.14, if old_size was zero and the mapping referred
>> to by old_address was a private mapping (mmap(2) MAP_PRIVATE),
>> mremap() created a new private mapping unrelated to the original
>> mapping. This behavior was unintended and probably unexpected in
>> user-space applications (since the intention of mremap() is to
>> create a new mapping based on the original mapping). Since Linux
>> 4.14, mremap() fails with the error EINVAL in this scenario.
>>
>> Does that seem okay?
>
> sorry to be late but yes this wording makes perfect sense to me.

Thanks, Michal.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

2017-09-25 12:52:12

by Michal Hocko

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On Mon 25-09-17 14:40:42, Florian Weimer wrote:
> On 09/25/2017 02:35 PM, Michal Hocko wrote:
> > What would be the usecase. I mean why don't you simply create a new
> > mapping by a plain mmap when you have no guarantee about the same
> > content?
>
> I plan to use it for creating an unbounded number of callback thunks at run
> time, from a single set of pages in libc.so, in case we need this
> functionality.
>
> The idea is to duplicate existing position-independent machine code in
> libc.so, prefixed by a data mapping which controls its behavior. Each
> data/code combination would only give us a fixed number of thunks, so we'd
> need to create a new mapping to increase the total number.
>
> Instead, we could re-map the code from the executable on disk, but not if
> chroot has been called or glibc has been updated on disk. Creating an alias
> mapping does not have these problems.
>
> Another application (but that's for anonymous memory) would be to duplicate
> class metadata in a Java-style VM, so that you can use bits in the class
> pointer in each Java object (which is similar to the vtable pointer in C++)
> for the garbage collector, without having to mask it when accessing the
> class metadata in regular (mutator) code.

So, how are you going to deal with the CoW and the implementation which
basically means that the new mmap content is not the same as the
original one?
--
Michal Hocko
SUSE Labs

2017-09-25 13:16:24

by Florian Weimer

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On 09/25/2017 02:52 PM, Michal Hocko wrote:
> So, how are you going to deal with the CoW and the implementation which
> basically means that the new mmap content is not the same as the
> original one?

I don't understand why CoW would kick in. The approach I outlined is
desirable because it avoids the need to modify any executable pages, so
this is not a concern. The point is to create a potentially unbounded
number of thunks *without* run-time code generation.

If the file is rewritten on disk, that's already undefined today, so
it's not something we need to be concerned with. (Anything which
replaces ELF files needs to use the rename-into-place approach anyway.)

Thanks,
Florian

2017-09-25 14:52:43

by Michal Hocko

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On Mon 25-09-17 15:16:09, Florian Weimer wrote:
> On 09/25/2017 02:52 PM, Michal Hocko wrote:
> > So, how are you going to deal with the CoW and the implementation which
> > basically means that the new mmap content is not the same as the
> > original one?
>
> I don't understand why CoW would kick in.

So you can guarantee nobody is going to write to that memory? Moreover
for the anonymous mapping you really get zero pages rather than the
original content AFAIR.

> The approach I outlined is
> desirable because it avoids the need to modify any executable pages, so this
> is not a concern. The point is to create a potentially unbounded number of
> thunks *without* run-time code generation.
>
> If the file is rewritten on disk, that's already undefined today, so it's
> not something we need to be concerned with. (Anything which replaces ELF
> files needs to use the rename-into-place approach anyway.)

Yeah that part is not all that interesting.
--
Michal Hocko
SUSE Labs

2017-09-25 14:54:46

by Florian Weimer

Subject: Re: [patch] mremap.2: Add description of old_size == 0 functionality

On 09/25/2017 04:52 PM, Michal Hocko wrote:
> On Mon 25-09-17 15:16:09, Florian Weimer wrote:
>> On 09/25/2017 02:52 PM, Michal Hocko wrote:
>>> So, how are you going to deal with the CoW and the implementation which
>>> basically means that the new mmap content is not the same as the
>>> original one?
>>
>> I don't understand why CoW would kick in.
>
> So you can guarantee nobody is going to write to that memory?

It's mapped readable and executable, but not writable. So the only
thing that could interfere would be a debugger.

Thanks,
Florian

2017-09-25 16:34:42

by Mike Kravetz

Subject: Re: [patch v2] mremap.2: Add description of old_size == 0 functionality

On 09/20/2017 12:25 AM, Michael Kerrisk (man-pages) wrote:
> Hello Mike,
>
> On 09/19/2017 11:42 PM, Mike Kravetz wrote:
>> v2: Fix incorrect wording noticed by Jann Horn.
>> Remove deprecated and memfd_create discussion as suggested
>> by Florian Weimer.
>>
>> Since at least the 2.6 time frame, mremap would create a new mapping
>> of the same pages if 'old_size == 0'. It would also leave the original
>> mapping. This was used to create a 'duplicate mapping'.
>>
>> A recent change was made to mremap so that an attempt to create a
>> duplicate of a private mapping will fail.
>>
>> Document the 'old_size == 0' behavior and new return code from
>> below commit.
>>
>> commit dba58d3b8c5045ad89c1c95d33d01451e3964db7
>> Author: Mike Kravetz <[email protected]>
>> Date: Wed Sep 6 16:20:55 2017 -0700
>>
>> mm/mremap: fail map duplication attempts for private mappings
>>
>> Signed-off-by: Mike Kravetz <[email protected]>
>> ---
>> man2/mremap.2 | 21 ++++++++++++++++++++-
>> 1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/man2/mremap.2 b/man2/mremap.2
>> index 98643c640..235984a96 100644
>> --- a/man2/mremap.2
>> +++ b/man2/mremap.2
>> @@ -58,6 +58,20 @@ may be provided; see the description of
>> .B MREMAP_FIXED
>> below.
>> .PP
>> +If the value of \fIold_size\fP is zero, and \fIold_address\fP refers to
>> +a shareable mapping (see
>> +.BR mmap (2)
>> +.BR MAP_SHARED )
>> +, then
>> +.BR mremap ()
>> +will create a new mapping of the same pages. \fInew_size\fP
>> +will be the size of the new mapping and the location of the new mapping
>> +may be specified with \fInew_address\fP, see the description of
>> +.B MREMAP_FIXED
>> +below. If a new mapping is requested via this method, then the
>> +.B MREMAP_MAYMOVE
>> +flag must also be specified.
>> +.PP
>> In Linux the memory is divided into pages.
>> A user process has (one or)
>> several linear virtual memory segments.
>> @@ -174,7 +188,12 @@ and
>> or
>> .B MREMAP_FIXED
>> was specified without also specifying
>> -.BR MREMAP_MAYMOVE .
>> +.BR MREMAP_MAYMOVE ;
>> +or \fIold_size\fP was zero and \fIold_address\fP does not refer to a
>> +shareable mapping;
>> +or \fIold_size\fP was zero and the
>> +.BR MREMAP_MAYMOVE
>> +flag was not specified.
>> .TP
>> .B ENOMEM
>> The memory area cannot be expanded at the current virtual address, and the
>
> I've applied this, and added Reviewed-by tags for Florian and Jann.
> But, I think it's also worth noting the older, now disallowed, behavior,
> and why the behavior was changed. So I added a note in BUGS:
>
> BUGS
> Before Linux 4.14, if old_size was zero and the mapping referred
> to by old_address was a private mapping (mmap(2) MAP_PRIVATE),
> mremap() created a new private mapping unrelated to the original
> mapping. This behavior was unintended and probably unexpected in
> user-space applications (since the intention of mremap() is to
> create a new mapping based on the original mapping). Since Linux
> 4.14, mremap() fails with the error EINVAL in this scenario.
>
> Does that seem okay?

Sorry for the late reply Michael, I've been away for a few days.

Yes, the above seems okay. Thanks for your help with this.

--
Mike Kravetz

Subject: Re: [patch v2] mremap.2: Add description of old_size == 0 functionality

Hi Mike,

On 25 September 2017 at 18:33, Mike Kravetz <[email protected]> wrote:
> On 09/20/2017 12:25 AM, Michael Kerrisk (man-pages) wrote:

[...]

>> I've applied this, and added Reviewed-by tags for Florian and Jann.
>> But, I think it's also worth noting the older, now disallowed, behavior,
>> and why the behavior was changed. So I added a note in BUGS:
>>
>> BUGS
>> Before Linux 4.14, if old_size was zero and the mapping referred
>> to by old_address was a private mapping (mmap(2) MAP_PRIVATE),
>> mremap() created a new private mapping unrelated to the original
>> mapping. This behavior was unintended and probably unexpected in
>> user-space applications (since the intention of mremap() is to
>> create a new mapping based on the original mapping). Since Linux
>> 4.14, mremap() fails with the error EINVAL in this scenario.
>>
>> Does that seem okay?
>
> Sorry for the late reply Michael, I've been away for a few days.
>
> Yes, the above seems okay. Thanks for your help with this.

You're welcome. Thanks for checking it over!

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/