From: John Hubbard <[email protected]>
Previously, MAP_FIXED was "discouraged", due to portability
issues with the fixed address. In fact, there are other, more
serious issues. Also, in some limited cases, this option can
be used safely.
Expand the documentation to discuss both the hazards, and how
to use it safely.
The "Portability issues" wording is lifted directly from
Matthew Wilcox's review. The notes about other libraries
creating mappings is also from Matthew (lightly edited).
The suggestion to explain how to use MAP_FIXED safely is
from Jann Horn.
Suggested-by: Matthew Wilcox <[email protected]>
Suggested-by: Jann Horn <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
Changed from v1:
-- Covered topics recommended by Matthew Wilcox
and Jann Horn, in their recent review: the hazards
of overwriting pre-exising mappings, and some notes
about how to use MAP_FIXED safely.
-- Rewrote the commit description accordingly.
man2/mmap.2 | 38 ++++++++++++++++++++++++++++++++++++--
1 file changed, 36 insertions(+), 2 deletions(-)
diff --git a/man2/mmap.2 b/man2/mmap.2
index 385f3bfd5..9038256d4 100644
--- a/man2/mmap.2
+++ b/man2/mmap.2
@@ -222,8 +222,42 @@ part of the existing mapping(s) will be discarded.
If the specified address cannot be used,
.BR mmap ()
will fail.
-Because requiring a fixed address for a mapping is less portable,
-the use of this option is discouraged.
+.IP
+This option is extremely hazardous (when used on its own) and moderately
+non-portable.
+.IP
+Portability issues: a process's memory map may change significantly from one
+run to the next, depending on library versions, kernel versions and random
+numbers.
+.IP
+Hazards: this option forcibly removes pre-existing mappings, making it easy
+for a multi-threaded process to corrupt its own address space.
+.IP
+For example, thread A looks through /proc/<pid>/maps and locates an available
+address range, while thread B simultaneously acquires part or all of that same
+address range. Thread A then calls mmap(MAP_FIXED), effectively overwriting
+thread B's mapping.
+.IP
+Thread B need not create a mapping directly; simply making a library call
+that, internally, uses dlopen(3) to load some other shared library, will
+suffice. The dlopen(3) call will map the library into the process's address
+space. Furthermore, almost any library call may be implemented using this
+technique.
+Examples include brk(2), malloc(3), pthread_create(3), and the PAM libraries
+(http://www.linux-pam.org).
+.IP
+Given the above limitations, one of the very few ways to use this option
+safely is: mmap() a region, without specifying MAP_FIXED. Then, within that
+region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the
+portability problem (because the first mmap call lets the kernel pick the
+address), and the address space corruption problem (because the region being
+overwritten is already owned by the calling thread).
+.IP
+Newer kernels
+(Linux 4.16 and later) have a
+.B MAP_FIXED_SAFE
+option that avoids the corruption problem; if available, MAP_FIXED_SAFE
+should be preferred over MAP_FIXED.
.TP
.B MAP_GROWSDOWN
This flag is used for stacks.
--
2.15.1
Hi!
I know that we are not touching the rest of the existing description for
MAP_FIXED however the second sentence in the manual page says that "addr
must be a multiple of the page size." Which however is misleading as
this is not enough on some architectures. Code in the wild seems to
(mis)use SHMLBA for aligment purposes but I'm not sure that we should
advise something like that in the manpages.
So what about something as:
"addr must be suitably aligned, for most architectures multiple of page
size is sufficient, however some may impose additional restrictions for
page mapping addresses."
Which should at least hint the reader that this is architecture specific.
--
Cyril Hrubis
[email protected]
On Sun, Dec 03, 2017 at 06:14:11PM -0800, [email protected] wrote:
> From: John Hubbard <[email protected]>
>
> Previously, MAP_FIXED was "discouraged", due to portability
> issues with the fixed address. In fact, there are other, more
> serious issues. Also, in some limited cases, this option can
> be used safely.
>
> Expand the documentation to discuss both the hazards, and how
> to use it safely.
>
> The "Portability issues" wording is lifted directly from
> Matthew Wilcox's review. The notes about other libraries
> creating mappings is also from Matthew (lightly edited).
>
> The suggestion to explain how to use MAP_FIXED safely is
> from Jann Horn.
>
> Suggested-by: Matthew Wilcox <[email protected]>
> Suggested-by: Jann Horn <[email protected]>
> Signed-off-by: John Hubbard <[email protected]>
> ---
>
> Changed from v1:
>
> -- Covered topics recommended by Matthew Wilcox
> and Jann Horn, in their recent review: the hazards
> of overwriting pre-exising mappings, and some notes
> about how to use MAP_FIXED safely.
>
> -- Rewrote the commit description accordingly.
>
> man2/mmap.2 | 38 ++++++++++++++++++++++++++++++++++++--
> 1 file changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/man2/mmap.2 b/man2/mmap.2
> index 385f3bfd5..9038256d4 100644
> --- a/man2/mmap.2
> +++ b/man2/mmap.2
> @@ -222,8 +222,42 @@ part of the existing mapping(s) will be discarded.
> If the specified address cannot be used,
> .BR mmap ()
> will fail.
> -Because requiring a fixed address for a mapping is less portable,
> -the use of this option is discouraged.
> +.IP
> +This option is extremely hazardous (when used on its own) and moderately
> +non-portable.
> +.IP
> +Portability issues: a process's memory map may change significantly from one
> +run to the next, depending on library versions, kernel versions and random
> +numbers.
> +.IP
> +Hazards: this option forcibly removes pre-existing mappings, making it easy
> +for a multi-threaded process to corrupt its own address space.
> +.IP
> +For example, thread A looks through /proc/<pid>/maps and locates an available
> +address range, while thread B simultaneously acquires part or all of that same
> +address range. Thread A then calls mmap(MAP_FIXED), effectively overwriting
> +thread B's mapping.
> +.IP
> +Thread B need not create a mapping directly; simply making a library call
> +that, internally, uses dlopen(3) to load some other shared library, will
> +suffice. The dlopen(3) call will map the library into the process's address
> +space. Furthermore, almost any library call may be implemented using this
> +technique.
> +Examples include brk(2), malloc(3), pthread_create(3), and the PAM libraries
> +(http://www.linux-pam.org).
> +.IP
> +Given the above limitations, one of the very few ways to use this option
> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that
> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the
> +portability problem (because the first mmap call lets the kernel pick the
> +address), and the address space corruption problem (because the region being
> +overwritten is already owned by the calling thread).
Maybe "address space corruption problem caused by implicit calls to mmap"?
The region allocated with the first mmap is not exactly owned by the
thread and a multi-thread application can still corrupt its memory if
different threads use mmap(MAP_FIXED) for overlapping regions.
My 2 cents.
> +.IP
> +Newer kernels
> +(Linux 4.16 and later) have a
> +.B MAP_FIXED_SAFE
> +option that avoids the corruption problem; if available, MAP_FIXED_SAFE
> +should be preferred over MAP_FIXED.
> .TP
> .B MAP_GROWSDOWN
> This flag is used for stacks.
> --
> 2.15.1
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [email protected]. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
>
--
Sincerely yours,
Mike.
On 12/04/2017 02:55 AM, Cyril Hrubis wrote:
> Hi!
> I know that we are not touching the rest of the existing description for
> MAP_FIXED however the second sentence in the manual page says that "addr
> must be a multiple of the page size." Which however is misleading as
> this is not enough on some architectures. Code in the wild seems to
> (mis)use SHMLBA for aligment purposes but I'm not sure that we should
> advise something like that in the manpages.
>
> So what about something as:
>
> "addr must be suitably aligned, for most architectures multiple of page
> size is sufficient, however some may impose additional restrictions for
> page mapping addresses."
>
Hi Cyril,
Right, so I've been looking into this today, and I think we can go a bit
further than that, even. The kernel, as far back as the *original* git
commit in 2005, implements mmap on ARM by requiring that the address is
aligned to SHMLBA:
arch/arm/mm/mmap.c:50:
if (flags & MAP_FIXED) {
if (aliasing && flags & MAP_SHARED &&
(addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
return -EINVAL;
return addr;
}
So, given that this has been the implementation for the last 12+ years (and
probably the whole time, in fact), I think we can be bold enough to use this
wording for the second sentence of MAP_FIXED:
"addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either
the system page size (on many architectures) or a multiple of the system
page size (on some architectures)."
What do you think?
thanks,
John Hubbard
NVIDIA
> Which should at least hint the reader that this is architecture specific.
>
On 12/04/2017 03:31 AM, Mike Rapoport wrote:
> On Sun, Dec 03, 2017 at 06:14:11PM -0800, [email protected] wrote:
>> From: John Hubbard <[email protected]>
>>
[...]
>> +.IP
>> +Given the above limitations, one of the very few ways to use this option
>> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that
>> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the
>> +portability problem (because the first mmap call lets the kernel pick the
>> +address), and the address space corruption problem (because the region being
>> +overwritten is already owned by the calling thread).
>
> Maybe "address space corruption problem caused by implicit calls to mmap"?
> The region allocated with the first mmap is not exactly owned by the
> thread and a multi-thread application can still corrupt its memory if
> different threads use mmap(MAP_FIXED) for overlapping regions.
>
> My 2 cents.
>
Hi Mike,
Yes, thanks for picking through this, and I agree that the above is misleading.
It should definitely not use the word "owned" at all. Re-doing the whole
paragraph in order to make it all fit together nicely, I get this:
"Given the above limitations, one of the very few ways to use this option
safely is: mmap() an enclosing region, without specifying MAP_FIXED.
Then, within that region, call mmap(MAP_FIXED) to suballocate regions
within the enclosing region. This avoids both the portability problem
(because the first mmap call lets the kernel pick the address), and the
address space corruption problem (because implicit calls to mmap will
not affect the already-mapped enclosing region)."
...how's that sound to you? I'll post a v3 soon with this.
thanks,
John Hubbard
NVIDIA
On Mon 04-12-17 18:14:18, John Hubbard wrote:
> On 12/04/2017 02:55 AM, Cyril Hrubis wrote:
> > Hi!
> > I know that we are not touching the rest of the existing description for
> > MAP_FIXED however the second sentence in the manual page says that "addr
> > must be a multiple of the page size." Which however is misleading as
> > this is not enough on some architectures. Code in the wild seems to
> > (mis)use SHMLBA for aligment purposes but I'm not sure that we should
> > advise something like that in the manpages.
> >
> > So what about something as:
> >
> > "addr must be suitably aligned, for most architectures multiple of page
> > size is sufficient, however some may impose additional restrictions for
> > page mapping addresses."
> >
>
> Hi Cyril,
>
> Right, so I've been looking into this today, and I think we can go a bit
> further than that, even. The kernel, as far back as the *original* git
> commit in 2005, implements mmap on ARM by requiring that the address is
> aligned to SHMLBA:
>
> arch/arm/mm/mmap.c:50:
>
> if (flags & MAP_FIXED) {
> if (aliasing && flags & MAP_SHARED &&
> (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
> return -EINVAL;
> return addr;
> }
>
> So, given that this has been the implementation for the last 12+ years (and
> probably the whole time, in fact), I think we can be bold enough to use this
> wording for the second sentence of MAP_FIXED:
>
> "addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either
> the system page size (on many architectures) or a multiple of the system
> page size (on some architectures)."
>
> What do you think?
I am not sure this is a good idea. This is pulling way too many
implementation details into the man page IMHO. Note that your wording is
even incorrect because this applies only to shared mappings and on some
architectures it even requires special memory regions. We do not want
all that in the man page...
--
Michal Hocko
SUSE Labs
On Mon 04-12-17 18:52:27, John Hubbard wrote:
> On 12/04/2017 03:31 AM, Mike Rapoport wrote:
> > On Sun, Dec 03, 2017 at 06:14:11PM -0800, [email protected] wrote:
> >> From: John Hubbard <[email protected]>
> >>
> [...]
> >> +.IP
> >> +Given the above limitations, one of the very few ways to use this option
> >> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that
> >> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the
> >> +portability problem (because the first mmap call lets the kernel pick the
> >> +address), and the address space corruption problem (because the region being
> >> +overwritten is already owned by the calling thread).
> >
> > Maybe "address space corruption problem caused by implicit calls to mmap"?
> > The region allocated with the first mmap is not exactly owned by the
> > thread and a multi-thread application can still corrupt its memory if
> > different threads use mmap(MAP_FIXED) for overlapping regions.
> >
> > My 2 cents.
> >
>
> Hi Mike,
>
> Yes, thanks for picking through this, and I agree that the above is misleading.
> It should definitely not use the word "owned" at all. Re-doing the whole
> paragraph in order to make it all fit together nicely, I get this:
>
> "Given the above limitations, one of the very few ways to use this option
> safely is: mmap() an enclosing region, without specifying MAP_FIXED.
> Then, within that region, call mmap(MAP_FIXED) to suballocate regions
> within the enclosing region. This avoids both the portability problem
> (because the first mmap call lets the kernel pick the address), and the
> address space corruption problem (because implicit calls to mmap will
> not affect the already-mapped enclosing region)."
>
> ...how's that sound to you? I'll post a v3 soon with this.
It sounds to me you are trying to tell way to much while actually being
a bit misleading. Even sub-range MAP_FIXED is not multi-thread safe.
Really the more corner cases you will try to cover the worse the end
result will end up. I would just try to be simple here and mention the
address space corruption issues you've had earlier and be done with it.
Maybe add a note that some architectures might need a special alignement
and fail if it is not the case but nothing really specific.
--
Michal Hocko
SUSE Labs
On 12/04/2017 11:05 PM, Michal Hocko wrote:
> On Mon 04-12-17 18:14:18, John Hubbard wrote:
>> On 12/04/2017 02:55 AM, Cyril Hrubis wrote:
>>> Hi!
>>> I know that we are not touching the rest of the existing description for
>>> MAP_FIXED however the second sentence in the manual page says that "addr
>>> must be a multiple of the page size." Which however is misleading as
>>> this is not enough on some architectures. Code in the wild seems to
>>> (mis)use SHMLBA for aligment purposes but I'm not sure that we should
>>> advise something like that in the manpages.
>>>
>>> So what about something as:
>>>
>>> "addr must be suitably aligned, for most architectures multiple of page
>>> size is sufficient, however some may impose additional restrictions for
>>> page mapping addresses."
>>>
>>
>> Hi Cyril,
>>
>> Right, so I've been looking into this today, and I think we can go a bit
>> further than that, even. The kernel, as far back as the *original* git
>> commit in 2005, implements mmap on ARM by requiring that the address is
>> aligned to SHMLBA:
>>
>> arch/arm/mm/mmap.c:50:
>>
>> if (flags & MAP_FIXED) {
>> if (aliasing && flags & MAP_SHARED &&
>> (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
>> return -EINVAL;
>> return addr;
>> }
>>
>> So, given that this has been the implementation for the last 12+ years (and
>> probably the whole time, in fact), I think we can be bold enough to use this
>> wording for the second sentence of MAP_FIXED:
>>
>> "addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either
>> the system page size (on many architectures) or a multiple of the system
>> page size (on some architectures)."
>>
>> What do you think?
>
> I am not sure this is a good idea. This is pulling way too many
> implementation details into the man page IMHO. Note that your wording is
> even incorrect because this applies only to shared mappings and on some
> architectures it even requires special memory regions. We do not want
> all that in the man page...
>
Hi Michal,
OK, so it sounds like Cyril's original wording would be just about right,
after all, like this?
"addr must be suitably aligned. For most architectures multiple of page
size is sufficient; however, some may impose additional restrictions."
(It does seem unfortunate that the man page cannot help the programmer
actually write correct code here. He or she is forced to read the kernel
implementation, in order to figure out the true alignment rules. I was
hoping we could avoid that.)
thanks,
John Hubbard
NVIDIA
On 12/04/2017 11:08 PM, Michal Hocko wrote:
> On Mon 04-12-17 18:52:27, John Hubbard wrote:
>> On 12/04/2017 03:31 AM, Mike Rapoport wrote:
>>> On Sun, Dec 03, 2017 at 06:14:11PM -0800, [email protected] wrote:
>>>> From: John Hubbard <[email protected]>
>>>>
>> [...]
>>>> +.IP
>>>> +Given the above limitations, one of the very few ways to use this option
>>>> +safely is: mmap() a region, without specifying MAP_FIXED. Then, within that
>>>> +region, call mmap(MAP_FIXED) to suballocate regions. This avoids both the
>>>> +portability problem (because the first mmap call lets the kernel pick the
>>>> +address), and the address space corruption problem (because the region being
>>>> +overwritten is already owned by the calling thread).
>>>
>>> Maybe "address space corruption problem caused by implicit calls to mmap"?
>>> The region allocated with the first mmap is not exactly owned by the
>>> thread and a multi-thread application can still corrupt its memory if
>>> different threads use mmap(MAP_FIXED) for overlapping regions.
>>>
>>> My 2 cents.
>>>
>>
>> Hi Mike,
>>
>> Yes, thanks for picking through this, and I agree that the above is misleading.
>> It should definitely not use the word "owned" at all. Re-doing the whole
>> paragraph in order to make it all fit together nicely, I get this:
>>
>> "Given the above limitations, one of the very few ways to use this option
>> safely is: mmap() an enclosing region, without specifying MAP_FIXED.
>> Then, within that region, call mmap(MAP_FIXED) to suballocate regions
>> within the enclosing region. This avoids both the portability problem
>> (because the first mmap call lets the kernel pick the address), and the
>> address space corruption problem (because implicit calls to mmap will
>> not affect the already-mapped enclosing region)."
>>
>> ...how's that sound to you? I'll post a v3 soon with this.
>
> It sounds to me you are trying to tell way to much while actually being
> a bit misleading. Even sub-range MAP_FIXED is not multi-thread safe.
>
> Really the more corner cases you will try to cover the worse the end
> result will end up. I would just try to be simple here and mention the
> address space corruption issues you've had earlier and be done with it.
> Maybe add a note that some architectures might need a special alignement
> and fail if it is not the case but nothing really specific.
>
Sure, I can drop the "how to use this safely" section. It seemed like a good
idea at the time... :)
thanks,
John Hubbard
NVIDIA
On Mon 04-12-17 23:42:00, John Hubbard wrote:
> On 12/04/2017 11:05 PM, Michal Hocko wrote:
> > On Mon 04-12-17 18:14:18, John Hubbard wrote:
> >> On 12/04/2017 02:55 AM, Cyril Hrubis wrote:
> >>> Hi!
> >>> I know that we are not touching the rest of the existing description for
> >>> MAP_FIXED however the second sentence in the manual page says that "addr
> >>> must be a multiple of the page size." Which however is misleading as
> >>> this is not enough on some architectures. Code in the wild seems to
> >>> (mis)use SHMLBA for aligment purposes but I'm not sure that we should
> >>> advise something like that in the manpages.
> >>>
> >>> So what about something as:
> >>>
> >>> "addr must be suitably aligned, for most architectures multiple of page
> >>> size is sufficient, however some may impose additional restrictions for
> >>> page mapping addresses."
> >>>
> >>
> >> Hi Cyril,
> >>
> >> Right, so I've been looking into this today, and I think we can go a bit
> >> further than that, even. The kernel, as far back as the *original* git
> >> commit in 2005, implements mmap on ARM by requiring that the address is
> >> aligned to SHMLBA:
> >>
> >> arch/arm/mm/mmap.c:50:
> >>
> >> if (flags & MAP_FIXED) {
> >> if (aliasing && flags & MAP_SHARED &&
> >> (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
> >> return -EINVAL;
> >> return addr;
> >> }
> >>
> >> So, given that this has been the implementation for the last 12+ years (and
> >> probably the whole time, in fact), I think we can be bold enough to use this
> >> wording for the second sentence of MAP_FIXED:
> >>
> >> "addr must be a multiple of SHMLBA (<sys/shm.h>), which in turn is either
> >> the system page size (on many architectures) or a multiple of the system
> >> page size (on some architectures)."
> >>
> >> What do you think?
> >
> > I am not sure this is a good idea. This is pulling way too many
> > implementation details into the man page IMHO. Note that your wording is
> > even incorrect because this applies only to shared mappings and on some
> > architectures it even requires special memory regions. We do not want
> > all that in the man page...
> >
>
> Hi Michal,
>
> OK, so it sounds like Cyril's original wording would be just about right,
> after all, like this?
>
> "addr must be suitably aligned. For most architectures multiple of page
> size is sufficient; however, some may impose additional restrictions."
>
> (It does seem unfortunate that the man page cannot help the programmer
> actually write correct code here. He or she is forced to read the kernel
> implementation, in order to figure out the true alignment rules. I was
> hoping we could avoid that.)
I strongly suspect that this is more the architecture than the kernel
implementation imposed restriction.
--
Michal Hocko
SUSE Labs
Hi!
> (It does seem unfortunate that the man page cannot help the programmer
> actually write correct code here. He or she is forced to read the kernel
> implementation, in order to figure out the true alignment rules. I was
> hoping we could avoid that.)
It would be nice if we had this information exported somehere so that we
do not have to rely on per-architecture ifdefs.
What about adding MapAligment or something similar to the /proc/meminfo?
--
Cyril Hrubis
[email protected]
On 12/06/2017 02:01 AM, Cyril Hrubis wrote:
> Hi!
>> (It does seem unfortunate that the man page cannot help the programmer
>> actually write correct code here. He or she is forced to read the kernel
>> implementation, in order to figure out the true alignment rules. I was
>> hoping we could avoid that.)
>
> It would be nice if we had this information exported somehere so that we
> do not have to rely on per-architecture ifdefs.
>
> What about adding MapAligment or something similar to the /proc/meminfo?
>
What's the use case you envision for that? I don't see how that would be
better than using SHMLBA, which is available at compiler time. Because
unless someone expects to be able to run an app that was compiled for
Arch X, on Arch Y (surely that's not requirement here?), I don't see how
the run-time check is any better.
Or maybe you're thinking that since the SHMLBA cannot be put in the man
pages, we could instead provide MapAlignment as sort of a different
way to document the requirement?
--
thanks,
John Hubbard
NVIDIA
On Thu 07-12-17 13:58:05, Cyril Hrubis wrote:
> Hi!
> > >> (It does seem unfortunate that the man page cannot help the programmer
> > >> actually write correct code here. He or she is forced to read the kernel
> > >> implementation, in order to figure out the true alignment rules. I was
> > >> hoping we could avoid that.)
> > >
> > > It would be nice if we had this information exported somehere so that we
> > > do not have to rely on per-architecture ifdefs.
> > >
> > > What about adding MapAligment or something similar to the /proc/meminfo?
> > >
> >
> > What's the use case you envision for that? I don't see how that would be
> > better than using SHMLBA, which is available at compiler time. Because
> > unless someone expects to be able to run an app that was compiled for
> > Arch X, on Arch Y (surely that's not requirement here?), I don't see how
> > the run-time check is any better.
>
> I guess that some kind of compile time constant in uapi headers will do
> as well, I'm really open to any solution that would expose this constant
> as some kind of official API.
I am not sure this is really feasible. It is not only a simple alignment
thing. Look at ppc for example (slice_get_unmapped_area). Other
architectures might have even more complicated rules e.g. arm and its
cache_is_vipt_aliasing. Also this applies only on MAP_SHARED || file
backed mappings.
I would really leave dogs sleeping... Trying to document all this in the
man page has chances to confuse more people than it has chances to help
those who already know all these nasty details.
--
Michal Hocko
SUSE Labs
Hi!
> >> (It does seem unfortunate that the man page cannot help the programmer
> >> actually write correct code here. He or she is forced to read the kernel
> >> implementation, in order to figure out the true alignment rules. I was
> >> hoping we could avoid that.)
> >
> > It would be nice if we had this information exported somehere so that we
> > do not have to rely on per-architecture ifdefs.
> >
> > What about adding MapAligment or something similar to the /proc/meminfo?
> >
>
> What's the use case you envision for that? I don't see how that would be
> better than using SHMLBA, which is available at compiler time. Because
> unless someone expects to be able to run an app that was compiled for
> Arch X, on Arch Y (surely that's not requirement here?), I don't see how
> the run-time check is any better.
I guess that some kind of compile time constant in uapi headers will do
as well, I'm really open to any solution that would expose this constant
as some kind of official API.
There is one problem with using SHMLBA, there are more than one libc
implementations and at least two of them (bionic and klibc) does not
implement the SysV IPC at all. I know that the chances that you are
writing a program that requires MAP_FIXED, is compiled against
alternative libc, and expected to run on less common architectures are
pretty slim. On the other hand I do not see a reason why this cannot be
done properly, this is just about exporting one simple constant to
userspace after all.
> Or maybe you're thinking that since the SHMLBA cannot be put in the man
> pages, we could instead provide MapAlignment as sort of a different
> way to document the requirement?
This is my intention as well.
--
Cyril Hrubis
[email protected]
On Thu 2017-12-07 15:02:21, Michal Hocko wrote:
> On Thu 07-12-17 13:58:05, Cyril Hrubis wrote:
> > Hi!
> > > >> (It does seem unfortunate that the man page cannot help the programmer
> > > >> actually write correct code here. He or she is forced to read the kernel
> > > >> implementation, in order to figure out the true alignment rules. I was
> > > >> hoping we could avoid that.)
> > > >
> > > > It would be nice if we had this information exported somehere so that we
> > > > do not have to rely on per-architecture ifdefs.
> > > >
> > > > What about adding MapAligment or something similar to the /proc/meminfo?
> > > >
> > >
> > > What's the use case you envision for that? I don't see how that would be
> > > better than using SHMLBA, which is available at compiler time. Because
> > > unless someone expects to be able to run an app that was compiled for
> > > Arch X, on Arch Y (surely that's not requirement here?), I don't see how
> > > the run-time check is any better.
> >
> > I guess that some kind of compile time constant in uapi headers will do
> > as well, I'm really open to any solution that would expose this constant
> > as some kind of official API.
>
> I am not sure this is really feasible. It is not only a simple alignment
> thing. Look at ppc for example (slice_get_unmapped_area). Other
> architectures might have even more complicated rules e.g. arm and its
> cache_is_vipt_aliasing. Also this applies only on MAP_SHARED || file
> backed mappings.
>
> I would really leave dogs sleeping... Trying to document all this in the
> man page has chances to confuse more people than it has chances to help
> those who already know all these nasty details.
You don't have to provide all the details, but warning that there's arch-
specific magic would be nice...
Pavel
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On 12/09/2017 09:19 AM, Pavel Machek wrote:
> On Thu 2017-12-07 15:02:21, Michal Hocko wrote:
>> On Thu 07-12-17 13:58:05, Cyril Hrubis wrote:
>>> Hi!
>>>>>> (It does seem unfortunate that the man page cannot help the programmer
>>>>>> actually write correct code here. He or she is forced to read the kernel
>>>>>> implementation, in order to figure out the true alignment rules. I was
>>>>>> hoping we could avoid that.)
>>>>>
>>>>> It would be nice if we had this information exported somehere so that we
>>>>> do not have to rely on per-architecture ifdefs.
>>>>>
>>>>> What about adding MapAligment or something similar to the /proc/meminfo?
>>>>>
>>>>
>>>> What's the use case you envision for that? I don't see how that would be
>>>> better than using SHMLBA, which is available at compiler time. Because
>>>> unless someone expects to be able to run an app that was compiled for
>>>> Arch X, on Arch Y (surely that's not requirement here?), I don't see how
>>>> the run-time check is any better.
>>>
>>> I guess that some kind of compile time constant in uapi headers will do
>>> as well, I'm really open to any solution that would expose this constant
>>> as some kind of official API.
>>
>> I am not sure this is really feasible. It is not only a simple alignment
>> thing. Look at ppc for example (slice_get_unmapped_area). Other
>> architectures might have even more complicated rules e.g. arm and its
>> cache_is_vipt_aliasing. Also this applies only on MAP_SHARED || file
>> backed mappings.
>>
>> I would really leave dogs sleeping... Trying to document all this in the
>> man page has chances to confuse more people than it has chances to help
>> those who already know all these nasty details.
>
> You don't have to provide all the details, but warning that there's arch-
> specific magic would be nice...
Hi Pavel,
In version 4 of this patch (which oddly enough, I have trouble finding via
google, it only seems to show up in patchwork.kernel.org [1]), I phrased it
like this:
Don't interpret addr as a hint: place the mapping at exactly that
address. addr must be suitably aligned: for most architectures a
multiple of page size is sufficient; however, some architectures
may impose additional restrictions.
...which is basically what Cyril was asking for, in his early feedback.
Does that work for you?
(Maybe I need to repost that patch. In any case the CC's need updating,
at least.)
[1] https://patchwork.kernel.org/patch/10094905/
thanks,
--
John Hubbard
NVIDIA
> Pavel
>
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
>