2015-11-16 17:37:24

by Piotr Kwapulinski

[permalink] [raw]
Subject: [PATCH] mm: fix incorrect behavior when process virtual address space limit is exceeded

When a new virtual memory area is added to the process's virtual address
space and this vma causes the process's virtual address space limit
(RLIMIT_AS) to be exceeded then kernel behaves incorrectly. Incorrect
behavior is a result of a kernel bug. The kernel in most cases
unnecessarily scans the entire process's virtual address space trying to
find the overlapping vma with the virtual memory region being added.
The kernel incorrectly compares the MAP_FIXED flag with vm_flags variable
in mmap_region function. The vm_flags variable should not be compared
with MAP_FIXED flag. The MAP_FIXED flag has got the same numerical value
as VM_MAYREAD flag (0x10). As a result the following test
from mmap_region:

if (!(vm_flags & MAP_FIXED))
is in fact:
if (!(vm_flags & VM_MAYREAD))

The VM_MAYREAD flag is almost always set in vm_flags while MAP_FIXED
flag is not so common. The result of the above condition is somewhat
reverted.
This patch fixes this bug. It causes that the kernel tries to find the
overlapping vma only when the requested virtual memory region has got
the fixed starting virtual address (MAP_FIXED flag set).
For tile architecture Calling mmap_region with the MAP_FIXED flag only is
sufficient. However the MAP_ANONYMOUS and MAP_PRIVATE flags are passed for
the completeness of the solution.

Signed-off-by: Piotr Kwapulinski <[email protected]>
---
arch/tile/mm/elf.c | 1 +
include/linux/mm.h | 3 ++-
mm/mmap.c | 7 ++++---
3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
index 6225cc9..dae4b33 100644
--- a/arch/tile/mm/elf.c
+++ b/arch/tile/mm/elf.c
@@ -142,6 +142,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
if (!retval) {
unsigned long addr = MEM_USER_INTRPT;
addr = mmap_region(NULL, addr, INTRPT_SIZE,
+ MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE,
VM_READ|VM_EXEC|
VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0);
if (addr > (unsigned long) -PAGE_SIZE)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 00bad77..1ae21c1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1911,7 +1911,8 @@ extern int install_special_mapping(struct mm_struct *mm,
extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);

extern unsigned long mmap_region(struct file *file, unsigned long addr,
- unsigned long len, vm_flags_t vm_flags, unsigned long pgoff);
+ unsigned long len, unsigned long flags,
+ vm_flags_t vm_flags, unsigned long pgoff);
extern unsigned long do_mmap(struct file *file, unsigned long addr,
unsigned long len, unsigned long prot, unsigned long flags,
vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate);
diff --git a/mm/mmap.c b/mm/mmap.c
index 2ce04a6..ad8b845 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1399,7 +1399,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
vm_flags |= VM_NORESERVE;
}

- addr = mmap_region(file, addr, len, vm_flags, pgoff);
+ addr = mmap_region(file, addr, len, flags, vm_flags, pgoff);
if (!IS_ERR_VALUE(addr) &&
((vm_flags & VM_LOCKED) ||
(flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1535,7 +1535,8 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
}

unsigned long mmap_region(struct file *file, unsigned long addr,
- unsigned long len, vm_flags_t vm_flags, unsigned long pgoff)
+ unsigned long len, unsigned long flags,
+ vm_flags_t vm_flags, unsigned long pgoff)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma, *prev;
@@ -1551,7 +1552,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
* MAP_FIXED may remove pages of mappings that intersects with
* requested mapping. Account for the pages it would unmap.
*/
- if (!(vm_flags & MAP_FIXED))
+ if (!(flags & MAP_FIXED))
return -ENOMEM;

nr_pages = count_vma_pages_range(mm, addr, addr + len);
--
2.6.2


2015-11-16 20:52:15

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm: fix incorrect behavior when process virtual address space limit is exceeded

[CCing Cyril]

On Mon 16-11-15 18:36:19, Piotr Kwapulinski wrote:
> When a new virtual memory area is added to the process's virtual address
> space and this vma causes the process's virtual address space limit
> (RLIMIT_AS) to be exceeded then kernel behaves incorrectly. Incorrect
> behavior is a result of a kernel bug. The kernel in most cases
> unnecessarily scans the entire process's virtual address space trying to
> find the overlapping vma with the virtual memory region being added.
> The kernel incorrectly compares the MAP_FIXED flag with vm_flags variable
> in mmap_region function. The vm_flags variable should not be compared
> with MAP_FIXED flag. The MAP_FIXED flag has got the same numerical value
> as VM_MAYREAD flag (0x10). As a result the following test
> from mmap_region:
>
> if (!(vm_flags & MAP_FIXED))
> is in fact:
> if (!(vm_flags & VM_MAYREAD))

It seems this has been broken since it was introduced e8420a8ece80
("mm/mmap: check for RLIMIT_AS before unmapping"). The patch has fixed
the case where unmap happened before the we tested may_expand_vm and
failed which was sufficient for the LTP test case but it apparently
never tried to consider the overlapping ranges to eventually allow to
create the mapping.

> The VM_MAYREAD flag is almost always set in vm_flags while MAP_FIXED
> flag is not so common. The result of the above condition is somewhat
> reverted.
> This patch fixes this bug. It causes that the kernel tries to find the
> overlapping vma only when the requested virtual memory region has got
> the fixed starting virtual address (MAP_FIXED flag set).
> For tile architecture Calling mmap_region with the MAP_FIXED flag only is
> sufficient. However the MAP_ANONYMOUS and MAP_PRIVATE flags are passed for
> the completeness of the solution.

I found the changelog rather hard to grasp. But the fix is correct. The
user visible effect is that RLIMIT_AS might hit for a MAP_FIXED
mapping even though it wouldn't enlarge the used address space.

> Signed-off-by: Piotr Kwapulinski <[email protected]>

Acked-by: Michal Hocko <[email protected]>

but I would encourage you to rephrase the changelog. Something like the
following maybe? Also I would rename flags to mmap_flags to be more
clear.
"
e8420a8ece80 ("mm/mmap: check for RLIMIT_AS before unmapping") has tried
to make MAP_FIXED bahavior wrt RLIMIT_AS more graceful. One of the issue
was that the original mapping shouldn't be destroyed if the mmap call
had to fail due to the address limit. Another one was that MAP_FIXED is
destroying an existing mapping and so might not enlarge the used address
space in the end and then the mmap should suceed. The later case is
implemented by checking count_vma_pages_range but it never really worked
because MAP_FIXED flag is compared against vm_flags rather than mmap
flags. This means that the overlapping areas were never checked because
MAP_FIXED has the same value as VM_MAYREAD which is basically always set.

Fix the issue by introducing a new parameter to mmap_region which forwards
the mmap flags and now the MAP_FIXED can be checked properly.
Tile and its arch_setup_additional_pages as the only in tree user of
mmap_region has to be specific about its mmap flags now.
"

I guess this is also a stable material.

> ---
> arch/tile/mm/elf.c | 1 +
> include/linux/mm.h | 3 ++-
> mm/mmap.c | 7 ++++---
> 3 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
> index 6225cc9..dae4b33 100644
> --- a/arch/tile/mm/elf.c
> +++ b/arch/tile/mm/elf.c
> @@ -142,6 +142,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
> if (!retval) {
> unsigned long addr = MEM_USER_INTRPT;
> addr = mmap_region(NULL, addr, INTRPT_SIZE,
> + MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE,
> VM_READ|VM_EXEC|
> VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0);
> if (addr > (unsigned long) -PAGE_SIZE)
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 00bad77..1ae21c1 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1911,7 +1911,8 @@ extern int install_special_mapping(struct mm_struct *mm,
> extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
>
> extern unsigned long mmap_region(struct file *file, unsigned long addr,
> - unsigned long len, vm_flags_t vm_flags, unsigned long pgoff);
> + unsigned long len, unsigned long flags,
> + vm_flags_t vm_flags, unsigned long pgoff);
> extern unsigned long do_mmap(struct file *file, unsigned long addr,
> unsigned long len, unsigned long prot, unsigned long flags,
> vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate);
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2ce04a6..ad8b845 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1399,7 +1399,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> vm_flags |= VM_NORESERVE;
> }
>
> - addr = mmap_region(file, addr, len, vm_flags, pgoff);
> + addr = mmap_region(file, addr, len, flags, vm_flags, pgoff);
> if (!IS_ERR_VALUE(addr) &&
> ((vm_flags & VM_LOCKED) ||
> (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
> @@ -1535,7 +1535,8 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
> }
>
> unsigned long mmap_region(struct file *file, unsigned long addr,
> - unsigned long len, vm_flags_t vm_flags, unsigned long pgoff)
> + unsigned long len, unsigned long flags,
> + vm_flags_t vm_flags, unsigned long pgoff)
> {
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma, *prev;
> @@ -1551,7 +1552,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> * MAP_FIXED may remove pages of mappings that intersects with
> * requested mapping. Account for the pages it would unmap.
> */
> - if (!(vm_flags & MAP_FIXED))
> + if (!(flags & MAP_FIXED))
> return -ENOMEM;
>
> nr_pages = count_vma_pages_range(mm, addr, addr + len);
> --
> 2.6.2
>

--
Michal Hocko
SUSE Labs

2015-11-17 00:33:28

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm: fix incorrect behavior when process virtual address space limit is exceeded

On Mon 16-11-15 21:52:10, Michal Hocko wrote:
> [CCing Cyril]
>
> On Mon 16-11-15 18:36:19, Piotr Kwapulinski wrote:
> > When a new virtual memory area is added to the process's virtual address
> > space and this vma causes the process's virtual address space limit
> > (RLIMIT_AS) to be exceeded then kernel behaves incorrectly. Incorrect
> > behavior is a result of a kernel bug. The kernel in most cases
> > unnecessarily scans the entire process's virtual address space trying to
> > find the overlapping vma with the virtual memory region being added.
> > The kernel incorrectly compares the MAP_FIXED flag with vm_flags variable
> > in mmap_region function. The vm_flags variable should not be compared
> > with MAP_FIXED flag. The MAP_FIXED flag has got the same numerical value
> > as VM_MAYREAD flag (0x10). As a result the following test
> > from mmap_region:
> >
> > if (!(vm_flags & MAP_FIXED))
> > is in fact:
> > if (!(vm_flags & VM_MAYREAD))
>
> It seems this has been broken since it was introduced e8420a8ece80
> ("mm/mmap: check for RLIMIT_AS before unmapping"). The patch has fixed
> the case where unmap happened before the we tested may_expand_vm and
> failed which was sufficient for the LTP test case but it apparently
> never tried to consider the overlapping ranges to eventually allow to
> create the mapping.
>
> > The VM_MAYREAD flag is almost always set in vm_flags while MAP_FIXED
> > flag is not so common. The result of the above condition is somewhat
> > reverted.
> > This patch fixes this bug. It causes that the kernel tries to find the
> > overlapping vma only when the requested virtual memory region has got
> > the fixed starting virtual address (MAP_FIXED flag set).
> > For tile architecture Calling mmap_region with the MAP_FIXED flag only is
> > sufficient. However the MAP_ANONYMOUS and MAP_PRIVATE flags are passed for
> > the completeness of the solution.
>
> I found the changelog rather hard to grasp. But the fix is correct. The
> user visible effect is that RLIMIT_AS might hit for a MAP_FIXED
> mapping even though it wouldn't enlarge the used address space.

And I was obviously blind and read the condition incorrectly. Sigh...
There is no real issue in fact. We do call count_vma_pages_range even
for !MAP_FIXED case but that is merely a pointless find_vma call but
nothing really earth shattering. So nothing for the stable tree and also
quite exaggerated to be called a bug.

I am sorry about the confusion.
--
Michal Hocko
SUSE Labs

2015-11-17 15:23:34

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] mm: fix incorrect behavior when process virtual address space limit is exceeded

On 11/16, Piotr Kwapulinski wrote:
>
> @@ -1551,7 +1552,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> * MAP_FIXED may remove pages of mappings that intersects with
> * requested mapping. Account for the pages it would unmap.
> */
> - if (!(vm_flags & MAP_FIXED))
> + if (!(flags & MAP_FIXED))
> return -ENOMEM;

Agree, "vm_flags & MAP_FIXED" makes no sense and just wrong...

Can't we simply remove this check? Afaics it only helps to avoid
count_vma_pages_range() in the unlikely case when may_expand_vm() fails.
And without MAP_FIXED count_vma_pages_range() should be cheap,
find_vma_intersection() should fail.

And afaics arch/tile/mm/elf.c can use do_mmap(MAP_FIXED ...) rather than
mmap_region(), it can be changed by a separate patch. In this case we can
unexport mmap_region().


OTOH, I won't insist, this patch looks fine to me.

Oleg.

2015-11-17 15:37:51

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] mm: fix incorrect behavior when process virtual address space limit is exceeded

On 11/17, Oleg Nesterov wrote:
>
> On 11/16, Piotr Kwapulinski wrote:
> >
> > @@ -1551,7 +1552,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> > * MAP_FIXED may remove pages of mappings that intersects with
> > * requested mapping. Account for the pages it would unmap.
> > */
> > - if (!(vm_flags & MAP_FIXED))
> > + if (!(flags & MAP_FIXED))
> > return -ENOMEM;
>
> Agree, "vm_flags & MAP_FIXED" makes no sense and just wrong...
>
> Can't we simply remove this check? Afaics it only helps to avoid
> count_vma_pages_range() in the unlikely case when may_expand_vm() fails.
> And without MAP_FIXED count_vma_pages_range() should be cheap,
> find_vma_intersection() should fail.

Or we can simply move this may_expand_vm() block to the caller, do_mmap().

> And afaics arch/tile/mm/elf.c can use do_mmap(MAP_FIXED ...) rather than
> mmap_region(), it can be changed by a separate patch. In this case we can
> unexport mmap_region().
>
> OTOH, I won't insist, this patch looks fine to me.

Yes, but what I actually tried to say is that it would be nice to unexport
mmap_region(), arch/tile is the only caller outside of mmap.c.

Oleg.

2015-11-17 17:27:21

by Piotr Kwapulinski

[permalink] [raw]
Subject: [PATCH] mm/mmap.c: remove incorrect MAP_FIXED flag comparison from mmap_region

The following flag comparison in mmap_region is not fully correct:

if (!(vm_flags & MAP_FIXED))

The vm_flags should not be compared with MAP_FIXED (0x10). It is a bit
confusing. This condition is almost always true since VM_MAYREAD (0x10)
flag is almost always set by default. This patch removes this condition.

Signed-off-by: Piotr Kwapulinski <[email protected]>
---
mm/mmap.c | 7 -------
1 file changed, 7 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 2ce04a6..02422ea 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1547,13 +1547,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
if (!may_expand_vm(mm, len >> PAGE_SHIFT)) {
unsigned long nr_pages;

- /*
- * MAP_FIXED may remove pages of mappings that intersects with
- * requested mapping. Account for the pages it would unmap.
- */
- if (!(vm_flags & MAP_FIXED))
- return -ENOMEM;
-
nr_pages = count_vma_pages_range(mm, addr, addr + len);

if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))
--
2.6.2

2015-11-17 17:38:29

by Chris Metcalf

[permalink] [raw]
Subject: Re: [PATCH] mm: fix incorrect behavior when process virtual address space limit is exceeded

On 11/17/2015 11:19 AM, Oleg Nesterov wrote:
> On 11/16, Piotr Kwapulinski wrote:
>> @@ -1551,7 +1552,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>> * MAP_FIXED may remove pages of mappings that intersects with
>> * requested mapping. Account for the pages it would unmap.
>> */
>> - if (!(vm_flags & MAP_FIXED))
>> + if (!(flags & MAP_FIXED))
>> return -ENOMEM;
> And afaics arch/tile/mm/elf.c can use do_mmap(MAP_FIXED ...) rather than
> mmap_region(), it can be changed by a separate patch. In this case we can
> unexport mmap_region().

The problem is that we are mapping a region of virtual address space that
the chip provides for setting up interrupt handlers (at 0xfc000000) but that
is above the TASK_SIZE cutoff, so do_mmap() would fail the call in
get_unmapped_area().

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com

2015-11-17 18:07:56

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] mm: fix incorrect behavior when process virtual address space limit is exceeded

On 11/17, Chris Metcalf wrote:
>
> On 11/17/2015 11:19 AM, Oleg Nesterov wrote:
>> On 11/16, Piotr Kwapulinski wrote:
>>> @@ -1551,7 +1552,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>>> * MAP_FIXED may remove pages of mappings that intersects with
>>> * requested mapping. Account for the pages it would unmap.
>>> */
>>> - if (!(vm_flags & MAP_FIXED))
>>> + if (!(flags & MAP_FIXED))
>>> return -ENOMEM;
>> And afaics arch/tile/mm/elf.c can use do_mmap(MAP_FIXED ...) rather than
>> mmap_region(), it can be changed by a separate patch. In this case we can
>> unexport mmap_region().
>
> The problem is that we are mapping a region of virtual address space that
> the chip provides for setting up interrupt handlers (at 0xfc000000) but that
> is above the TASK_SIZE cutoff,

Ah, I didn't bother to read the comment in arch_setup_additional_pages().
Thanks for your explanation.

Oleg.

2015-11-18 00:52:54

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] mm/mmap.c: remove incorrect MAP_FIXED flag comparison from mmap_region

On Tue, 17 Nov 2015 18:26:38 +0100 Piotr Kwapulinski <[email protected]> wrote:

> The following flag comparison in mmap_region is not fully correct:
>
> if (!(vm_flags & MAP_FIXED))
>
> The vm_flags should not be compared with MAP_FIXED (0x10). It is a bit
> confusing. This condition is almost always true since VM_MAYREAD (0x10)
> flag is almost always set by default. This patch removes this condition.
>
> ...
>
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1547,13 +1547,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> if (!may_expand_vm(mm, len >> PAGE_SHIFT)) {
> unsigned long nr_pages;
>
> - /*
> - * MAP_FIXED may remove pages of mappings that intersects with
> - * requested mapping. Account for the pages it would unmap.
> - */
> - if (!(vm_flags & MAP_FIXED))
> - return -ENOMEM;
> -
> nr_pages = count_vma_pages_range(mm, addr, addr + len);
>
> if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))

That looks simpler.

However the changelog doesn't describe the end-user visible effects of
the bug, as changelogs should always do. Presumably this is causing
incorrect ENOMEM reporting due to RLIMIT_AS being exceeded, but this
isn't very specific.

So can you please fill in the details here? Such info is needed when
deciding which kernel version(s) need the fix.

Thanks.

2015-11-18 14:32:27

by Cyril Hrubis

[permalink] [raw]
Subject: Re: [PATCH] mm: fix incorrect behavior when process virtual address space limit is exceeded

Hi!
> [CCing Cyril]

Ah I've confused the vm_flags and flags variables and nobody caught that
during the review. But still sorry that I've messed up.

Looking at the code I agree with Michal that we try to find the
intesection poinlesly even for !MAP_FIXED which slowns down mmap() tiny
little bit which should be fixed.

The fix looks good to me.

--
Cyril Hrubis
[email protected]

2015-11-18 16:29:47

by Piotr Kwapulinski

[permalink] [raw]
Subject: Re: [PATCH] mm/mmap.c: remove incorrect MAP_FIXED flag comparison from mmap_region

On Tue, Nov 17, 2015 at 04:52:51PM -0800, Andrew Morton wrote:
> On Tue, 17 Nov 2015 18:26:38 +0100 Piotr Kwapulinski <[email protected]> wrote:
>
> > The following flag comparison in mmap_region is not fully correct:
> >
> > if (!(vm_flags & MAP_FIXED))
> >
> > The vm_flags should not be compared with MAP_FIXED (0x10). It is a bit
> > confusing. This condition is almost always true since VM_MAYREAD (0x10)
> > flag is almost always set by default. This patch removes this condition.
> >
> > ...
> >
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1547,13 +1547,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> > if (!may_expand_vm(mm, len >> PAGE_SHIFT)) {
> > unsigned long nr_pages;
> >
> > - /*
> > - * MAP_FIXED may remove pages of mappings that intersects with
> > - * requested mapping. Account for the pages it would unmap.
> > - */
> > - if (!(vm_flags & MAP_FIXED))
> > - return -ENOMEM;
> > -
> > nr_pages = count_vma_pages_range(mm, addr, addr + len);
> >
> > if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))
>
> That looks simpler.
>
> However the changelog doesn't describe the end-user visible effects of
> the bug, as changelogs should always do. Presumably this is causing
> incorrect ENOMEM reporting due to RLIMIT_AS being exceeded, but this
> isn't very specific.
>
> So can you please fill in the details here? Such info is needed when
> deciding which kernel version(s) need the fix.
>
> Thanks.

The first patch has got a user visible effect and it fixes the
real issue (corner case one). The second patch has no user visible effect.
It just removes the code that makes no sense. The second patch has
been created in case the first patch was not going to be accepted.
I will send both patches again to let you choose which one you preffer.
This time the patches will contain the more clear changelog containing
the user visible effect.

Thanks.
---
Piotr Kwapulinski

2015-11-20 16:39:53

by Piotr Kwapulinski

[permalink] [raw]
Subject: [PATCH v2 1/2] mm: fix incorrect behavior when process virtual address space limit is exceeded

The following flag comparison in mmap_region makes no sense:

if (!(vm_flags & MAP_FIXED))
return -ENOMEM;

The condition is false even if MAP_FIXED is not set what causes the
unnecessary find_vma call. MAP_FIXED has the same value as VM_MAYREAD.
The vm_flags must not be compared with MAP_FIXED. The vm_flags may only
be compared with VM_* flags. The mmap executes slightly longer when
MAP_FIXED is not set and RLIMIT_AS is exceeded.

Fix the issue by introducing a new parameter to mmap_region which forwards
the mmap flags and now the MAP_FIXED can be checked properly.
Tile and its arch_setup_additional_pages as the user of
mmap_region has to be specific about its mmap flags now.

Signed-off-by: Piotr Kwapulinski <[email protected]>
---
arch/tile/mm/elf.c | 1 +
include/linux/mm.h | 3 ++-
mm/mmap.c | 7 ++++---
3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
index 6225cc9..dae4b33 100644
--- a/arch/tile/mm/elf.c
+++ b/arch/tile/mm/elf.c
@@ -142,6 +142,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
if (!retval) {
unsigned long addr = MEM_USER_INTRPT;
addr = mmap_region(NULL, addr, INTRPT_SIZE,
+ MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE,
VM_READ|VM_EXEC|
VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0);
if (addr > (unsigned long) -PAGE_SIZE)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 00bad77..f1a203f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1911,7 +1911,8 @@ extern int install_special_mapping(struct mm_struct *mm,
extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);

extern unsigned long mmap_region(struct file *file, unsigned long addr,
- unsigned long len, vm_flags_t vm_flags, unsigned long pgoff);
+ unsigned long len, unsigned long mmap_flags,
+ vm_flags_t vm_flags, unsigned long pgoff);
extern unsigned long do_mmap(struct file *file, unsigned long addr,
unsigned long len, unsigned long prot, unsigned long flags,
vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate);
diff --git a/mm/mmap.c b/mm/mmap.c
index 2ce04a6..8f3427f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1399,7 +1399,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
vm_flags |= VM_NORESERVE;
}

- addr = mmap_region(file, addr, len, vm_flags, pgoff);
+ addr = mmap_region(file, addr, len, flags, vm_flags, pgoff);
if (!IS_ERR_VALUE(addr) &&
((vm_flags & VM_LOCKED) ||
(flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1535,7 +1535,8 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
}

unsigned long mmap_region(struct file *file, unsigned long addr,
- unsigned long len, vm_flags_t vm_flags, unsigned long pgoff)
+ unsigned long len, unsigned long mmap_flags,
+ vm_flags_t vm_flags, unsigned long pgoff)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma, *prev;
@@ -1551,7 +1552,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
* MAP_FIXED may remove pages of mappings that intersects with
* requested mapping. Account for the pages it would unmap.
*/
- if (!(vm_flags & MAP_FIXED))
+ if (!(mmap_flags & MAP_FIXED))
return -ENOMEM;

nr_pages = count_vma_pages_range(mm, addr, addr + len);
--
2.6.2

2015-11-20 16:42:39

by Piotr Kwapulinski

[permalink] [raw]
Subject: [PATCH v2 2/2] mm/mmap.c: remove incorrect MAP_FIXED flag comparison from mmap_region

The following flag comparison in mmap_region makes no sense:

if (!(vm_flags & MAP_FIXED))
return -ENOMEM;

The condition is always false and thus the above "return -ENOMEM" is never
executed. The vm_flags must not be compared with MAP_FIXED flag.
The vm_flags may only be compared with VM_* flags.
MAP_FIXED has the same value as VM_MAYREAD.
It has no user visible effect.

Remove the code that makes no sense.

Signed-off-by: Piotr Kwapulinski <[email protected]>
---
I made a mistake in a changelog in a previous version of this patch.
I'm Sorry for the confusion.
This patch may be considered to be applied only in case the patch
"[PATCH v2 1/2] mm: fix incorrect behavior when process virtual
address space limit is exceeded"
is not going to be accepted.

mm/mmap.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 2ce04a6..42a8259 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1551,9 +1551,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
* MAP_FIXED may remove pages of mappings that intersects with
* requested mapping. Account for the pages it would unmap.
*/
- if (!(vm_flags & MAP_FIXED))
- return -ENOMEM;
-
nr_pages = count_vma_pages_range(mm, addr, addr + len);

if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))
--
2.6.2

2015-11-23 08:19:51

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] mm/mmap.c: remove incorrect MAP_FIXED flag comparison from mmap_region

On Fri 20-11-15 17:42:14, Piotr Kwapulinski wrote:
> The following flag comparison in mmap_region makes no sense:
>
> if (!(vm_flags & MAP_FIXED))
> return -ENOMEM;
>
> The condition is always false and thus the above "return -ENOMEM" is never
> executed. The vm_flags must not be compared with MAP_FIXED flag.
> The vm_flags may only be compared with VM_* flags.
> MAP_FIXED has the same value as VM_MAYREAD.
> It has no user visible effect.
>
> Remove the code that makes no sense.
>
> Signed-off-by: Piotr Kwapulinski <[email protected]>

I think this is preferable. Hitting the rlimit is a slow path and
find_vma_intersection should realize that there is no overlapping
VMA for !MAP_FIXED case pretty quickly.

I would prefer this to be in the changelog rather than/in addition to
"It has no user visible effect" which is really vague.

Acked-by: Michal Hocko <[email protected]>

> ---
> I made a mistake in a changelog in a previous version of this patch.
> I'm Sorry for the confusion.
> This patch may be considered to be applied only in case the patch
> "[PATCH v2 1/2] mm: fix incorrect behavior when process virtual
> address space limit is exceeded"
> is not going to be accepted.
>
> mm/mmap.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2ce04a6..42a8259 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1551,9 +1551,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> * MAP_FIXED may remove pages of mappings that intersects with
> * requested mapping. Account for the pages it would unmap.
> */
> - if (!(vm_flags & MAP_FIXED))
> - return -ENOMEM;
> -
> nr_pages = count_vma_pages_range(mm, addr, addr + len);
>
> if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))
> --
> 2.6.2
>

--
Michal Hocko
SUSE Labs

2015-11-23 17:37:24

by Piotr Kwapulinski

[permalink] [raw]
Subject: [PATCH v3] mm/mmap.c: remove incorrect MAP_FIXED flag comparison from mmap_region

The following flag comparison in mmap_region makes no sense:

if (!(vm_flags & MAP_FIXED))
return -ENOMEM;

The condition is always false and thus the above "return -ENOMEM" is never
executed. The vm_flags must not be compared with MAP_FIXED flag.
The vm_flags may only be compared with VM_* flags.
MAP_FIXED has the same value as VM_MAYREAD.
Hitting the rlimit is a slow path and find_vma_intersection should realize
that there is no overlapping VMA for !MAP_FIXED case pretty quickly.

Remove the code that makes no sense.

Signed-off-by: Piotr Kwapulinski <[email protected]>
---
mm/mmap.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 2ce04a6..42a8259 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1551,9 +1551,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
* MAP_FIXED may remove pages of mappings that intersects with
* requested mapping. Account for the pages it would unmap.
*/
- if (!(vm_flags & MAP_FIXED))
- return -ENOMEM;
-
nr_pages = count_vma_pages_range(mm, addr, addr + len);

if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))
--
2.6.2

2015-11-23 22:14:06

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v3] mm/mmap.c: remove incorrect MAP_FIXED flag comparison from mmap_region

On Mon, 23 Nov 2015 18:36:42 +0100 Piotr Kwapulinski <[email protected]> wrote:

> The following flag comparison in mmap_region makes no sense:
>
> if (!(vm_flags & MAP_FIXED))
> return -ENOMEM;
>
> The condition is always false and thus the above "return -ENOMEM" is never
> executed. The vm_flags must not be compared with MAP_FIXED flag.
> The vm_flags may only be compared with VM_* flags.
> MAP_FIXED has the same value as VM_MAYREAD.
> Hitting the rlimit is a slow path and find_vma_intersection should realize
> that there is no overlapping VMA for !MAP_FIXED case pretty quickly.
>
> Remove the code that makes no sense.
>
> ...
>
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1551,9 +1551,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> * MAP_FIXED may remove pages of mappings that intersects with
> * requested mapping. Account for the pages it would unmap.
> */
> - if (!(vm_flags & MAP_FIXED))
> - return -ENOMEM;
> -
> nr_pages = count_vma_pages_range(mm, addr, addr + len);
>
> if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))

Did you intend to retain the stale comment?

2015-11-24 16:12:56

by Piotr Kwapulinski

[permalink] [raw]
Subject: Re: [PATCH v3] mm/mmap.c: remove incorrect MAP_FIXED flag comparison from mmap_region

On Mon, Nov 23, 2015 at 02:14:01PM -0800, Andrew Morton wrote:
> On Mon, 23 Nov 2015 18:36:42 +0100 Piotr Kwapulinski <[email protected]> wrote:
>
> > The following flag comparison in mmap_region makes no sense:
> >
> > if (!(vm_flags & MAP_FIXED))
> > return -ENOMEM;
> >
> > The condition is always false and thus the above "return -ENOMEM" is never
> > executed. The vm_flags must not be compared with MAP_FIXED flag.
> > The vm_flags may only be compared with VM_* flags.
> > MAP_FIXED has the same value as VM_MAYREAD.
> > Hitting the rlimit is a slow path and find_vma_intersection should realize
> > that there is no overlapping VMA for !MAP_FIXED case pretty quickly.
> >
> > Remove the code that makes no sense.
> >
> > ...
> >
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1551,9 +1551,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> > * MAP_FIXED may remove pages of mappings that intersects with
> > * requested mapping. Account for the pages it would unmap.
> > */
> > - if (!(vm_flags & MAP_FIXED))
> > - return -ENOMEM;
> > -
> > nr_pages = count_vma_pages_range(mm, addr, addr + len);
> >
> > if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))
>
> Did you intend to retain the stale comment?

It was my intention. This comment is still valid, even after removing the
condition.

2015-11-27 05:29:38

by Naoya Horiguchi

[permalink] [raw]
Subject: Re: [PATCH v3] mm/mmap.c: remove incorrect MAP_FIXED flag comparison from mmap_region

On Mon, Nov 23, 2015 at 06:36:42PM +0100, Piotr Kwapulinski wrote:
> The following flag comparison in mmap_region makes no sense:
>
> if (!(vm_flags & MAP_FIXED))
> return -ENOMEM;
>
> The condition is always false and thus the above "return -ENOMEM" is never
> executed. The vm_flags must not be compared with MAP_FIXED flag.
> The vm_flags may only be compared with VM_* flags.
> MAP_FIXED has the same value as VM_MAYREAD.
> Hitting the rlimit is a slow path and find_vma_intersection should realize
> that there is no overlapping VMA for !MAP_FIXED case pretty quickly.
>
> Remove the code that makes no sense.
>
> Signed-off-by: Piotr Kwapulinski <[email protected]>

Looks good to me. Thank you.

Reviewed-by: Naoya Horiguchi <[email protected]>

> ---
> mm/mmap.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2ce04a6..42a8259 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1551,9 +1551,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> * MAP_FIXED may remove pages of mappings that intersects with
> * requested mapping. Account for the pages it would unmap.
> */
> - if (!(vm_flags & MAP_FIXED))
> - return -ENOMEM;
> -
> nr_pages = count_vma_pages_range(mm, addr, addr + len);
>
> if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))
> --
> 2.6.2
> -