2007-01-25 21:43:01

by Adam Litke

[permalink] [raw]
Subject: [PATCH] Don't allow the stack to grow into hugetlb reserved regions


When expanding the stack, we don't currently check if the VMA will cross into
an area of the address space that is reserved for hugetlb pages. Subsequent
faults on the expanded portion of such a VMA will confuse the low-level MMU
code, resulting in an OOPS. Check for this.

Signed-off-by: Adam Litke <[email protected]>
---

mm/mmap.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 9717337..2c6b163 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1477,6 +1477,7 @@ static int acct_stack_growth(struct vm_area_struct * vma, unsigned long size, un
{
struct mm_struct *mm = vma->vm_mm;
struct rlimit *rlim = current->signal->rlim;
+ unsigned long new_start;

/* address space limit tests */
if (!may_expand_vm(mm, grow))
@@ -1496,6 +1497,12 @@ static int acct_stack_growth(struct vm_area_struct * vma, unsigned long size, un
return -ENOMEM;
}

+ /* Check to make the stack will not grow into a hugetlb-only region. */
+ new_start = (vma->vm_flags & VM_GROWSUP) ? vma->vm_start :
+ vma->vm_end - size;
+ if (is_hugepage_only_range(vma->vm_mm, new_start, size))
+ return -EFAULT;
+
/*
* Overcommit.. This must be the final test, as it will
* update security statistics.


2007-01-26 20:06:12

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Don't allow the stack to grow into hugetlb reserved regions

On Thu, 25 Jan 2007 13:40:52 -0800
Adam Litke <[email protected]> wrote:

> When expanding the stack, we don't currently check if the VMA will cross into
> an area of the address space that is reserved for hugetlb pages. Subsequent
> faults on the expanded portion of such a VMA will confuse the low-level MMU
> code, resulting in an OOPS. Check for this.

We prefer not to oops. Is there any reason why this isn't a serious fix, needed
in 2.6.20 and 2.6.19?

2007-01-26 21:02:41

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] Don't allow the stack to grow into hugetlb reserved regions

On Thu, 25 Jan 2007, Adam Litke wrote:

> When expanding the stack, we don't currently check if the VMA will cross into
> an area of the address space that is reserved for hugetlb pages. Subsequent
> faults on the expanded portion of such a VMA will confuse the low-level MMU
> code, resulting in an OOPS. Check for this.

Good point, and your patch looks good; but a couple of reservations.

Trivial: I'd have preferred your error return to be -ENOMEM like all
the rest there, I can't see a good reason for this one to say -EFAULT.
Though apparently nothing cares just so long as it's non-zero.

Less trivial (and I wonder whether you've come to this from an ia64
or a powerpc direction): I notice that ia64 has more stringent REGION
checks in its ia64_do_page_fault, before calling expand_stack or
expand_upwards. So on that path, the usual path, I think your
new check in acct_stack_growth is unnecessary on ia64; whereas
coming from the get_user_pages find_extend_vma direction,
your check is inadequate?

I'd dearly like to believe that get_user_pages shouldn't be needing
find_extend_vma (it ends up using the wrong task's rlimits); but last
time I researched that, it looked very deliberate, that ptrace be
allowed to expand the stack - I think there's something out there
which would break if that weren't allowed, but I don't know what.

Hugh

>
> Signed-off-by: Adam Litke <[email protected]>
> ---
>
> mm/mmap.c | 7 +++++++
> 1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9717337..2c6b163 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1477,6 +1477,7 @@ static int acct_stack_growth(struct vm_area_struct * vma, unsigned long size, un
> {
> struct mm_struct *mm = vma->vm_mm;
> struct rlimit *rlim = current->signal->rlim;
> + unsigned long new_start;
>
> /* address space limit tests */
> if (!may_expand_vm(mm, grow))
> @@ -1496,6 +1497,12 @@ static int acct_stack_growth(struct vm_area_struct * vma, unsigned long size, un
> return -ENOMEM;
> }
>
> + /* Check to make the stack will not grow into a hugetlb-only region. */
> + new_start = (vma->vm_flags & VM_GROWSUP) ? vma->vm_start :
> + vma->vm_end - size;
> + if (is_hugepage_only_range(vma->vm_mm, new_start, size))
> + return -EFAULT;
> +
> /*
> * Overcommit.. This must be the final test, as it will
> * update security statistics.
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [email protected]. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
>

2007-01-26 22:49:07

by Ken Chen

[permalink] [raw]
Subject: Re: [PATCH] Don't allow the stack to grow into hugetlb reserved regions

On 1/26/07, Hugh Dickins <[email protected]> wrote:
> Less trivial (and I wonder whether you've come to this from an ia64
> or a powerpc direction): I notice that ia64 has more stringent REGION
> checks in its ia64_do_page_fault, before calling expand_stack or
> expand_upwards. So on that path, the usual path, I think your
> new check in acct_stack_growth is unnecessary on ia64;

I think you are correct. This appears to affect powerpc only. On ia64,
hugetlb lives in a completely different region and they can never step
into normal stack address space. And for x86, there isn't a thing called
"reserved address space" for hugetlb mapping.

- Ken

2007-01-27 09:08:51

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] Don't allow the stack to grow into hugetlb reserved regions

On Fri, 26 Jan 2007, Ken Chen wrote:
> On 1/26/07, Hugh Dickins <[email protected]> wrote:
> > Less trivial (and I wonder whether you've come to this from an ia64
> > or a powerpc direction): I notice that ia64 has more stringent REGION
> > checks in its ia64_do_page_fault, before calling expand_stack or
> > expand_upwards. So on that path, the usual path, I think your
> > new check in acct_stack_growth is unnecessary on ia64;
>
> I think you are correct. This appears to affect powerpc only. On ia64,
> hugetlb lives in a completely different region and they can never step
> into normal stack address space. And for x86, there isn't a thing called
> "reserved address space" for hugetlb mapping.

Thanks, that's reassuring for the hugetlb case, and therefore Adam's
patch should not be delayed. But it does leave open the question I
was raising in the text you've snipped: if ia64 needs those stringent
REGION checks in its ia64_do_page_fault path, don't we need to add
them some(messy)how in the get_user_pages find_extend_vma path?

Hugh

2007-01-28 20:28:07

by Ken Chen

[permalink] [raw]
Subject: Re: [PATCH] Don't allow the stack to grow into hugetlb reserved regions

On 1/27/07, Hugh Dickins <[email protected]> wrote:
> Thanks, that's reassuring for the hugetlb case, and therefore Adam's
> patch should not be delayed. But it does leave open the question I
> was raising in the text you've snipped: if ia64 needs those stringent
> REGION checks in its ia64_do_page_fault path, don't we need to add
> them some(messy)how in the get_user_pages find_extend_vma path?

I left it out because I need more time to digest what you said. After
looked through ia64's page fault and get_user_pages, I've concluded
that the bug scenario Adam described is impossible to trigger on ia64
due to various constrains and how the virtual address is laid out.

For ia64, the hugetlb address region is reserved at the top of user
space address. Stacks are below that region. Throw in the mix, we
have two stacks, one memory stack that grows down and one register
stack backing store that grows up. These two stacks are always in
pair and grow towards each other. And lastly, we have virtual address
holes in between regions. It's just impossible to grow any of these
two stacks into hugetlb region no matter how I played it.

So, AFAICS this bug doesn't apply to ia64 (and certainly not x86). The
new check of is_hugepage_only_range() is really a noop for both arches.

- Ken

2007-01-29 17:26:51

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] Don't allow the stack to grow into hugetlb reserved regions

On Sun, 28 Jan 2007, Ken Chen wrote:
>
> For ia64, the hugetlb address region is reserved at the top of user
> space address. Stacks are below that region. Throw in the mix, we
> have two stacks, one memory stack that grows down and one register
> stack backing store that grows up. These two stacks are always in
> pair and grow towards each other. And lastly, we have virtual address
> holes in between regions. It's just impossible to grow any of these
> two stacks into hugetlb region no matter how I played it.
>
> So, AFAICS this bug doesn't apply to ia64 (and certainly not x86). The
> new check of is_hugepage_only_range() is really a noop for both arches.

Certainly not a problem on x86.

But, never mind hugetlb, you still not quite convinced me that there's
no problem at all with get_user_pages find_extend_vma growing on ia64.

I repeat that ia64_do_page_fault has REGION tests to guard against
expanding either kind of stack across into another region. ia64_brk,
ia64_mmap_check and arch_get_unmapped_area have RGN_MAP_LIMIT checks.
But where is the equivalent paranoia when ptrace calls get_user_pages
calls find_extend_vma?

If your usual stacks face each other across the same region, they're
not going to pose problem. But what if someone mmaps MAP_GROWSDOWN
near the base of a region, then uses ptrace to touch an address near
the top of the region below?

Hugh

2007-01-29 18:32:55

by Ken Chen

[permalink] [raw]
Subject: Re: [PATCH] Don't allow the stack to grow into hugetlb reserved regions

On 1/29/07, Hugh Dickins <[email protected]> wrote:
> But, never mind hugetlb, you still not quite convinced me that there's
> no problem at all with get_user_pages find_extend_vma growing on ia64.
>
> I repeat that ia64_do_page_fault has REGION tests to guard against
> expanding either kind of stack across into another region. ia64_brk,
> ia64_mmap_check and arch_get_unmapped_area have RGN_MAP_LIMIT checks.
> But where is the equivalent paranoia when ptrace calls get_user_pages
> calls find_extend_vma?
>
> If your usual stacks face each other across the same region, they're
> not going to pose problem. But what if someone mmaps MAP_GROWSDOWN
> near the base of a region, then uses ptrace to touch an address near
> the top of the region below?

OK, now I fully understand what you are after. I kept on thinking in the
context of hugetlb. You are correct that ia64 does not have proper address
check for find_extend_vma() and it is indeed a potentially very bad bug in
there. I'm with you, I don't see the equivalent RGN_MAP_LIMIT check in the
get_user_pages() path.

Forwarding this to Tony as I don't have any access to ia64 machine anymore
to test/validate a fix.

- Ken