2004-04-16 04:17:37

by David Gibson

[permalink] [raw]
Subject: Fix bogus get_page() calls in hugepage code

Andrew, please apply:

On some archs the functions used to implement follow_page() for
hugepages do a get_page(). This is unlike the normal-page path for
follow_page(), so presumably a bug. This patch fixes it.


Index: working-2.6/arch/ppc64/mm/hugetlbpage.c
===================================================================
--- working-2.6.orig/arch/ppc64/mm/hugetlbpage.c 2004-04-13 11:42:35.000000000 +1000
+++ working-2.6/arch/ppc64/mm/hugetlbpage.c 2004-04-16 13:45:22.000000000 +1000
@@ -360,10 +360,8 @@
BUG_ON(! pmd_hugepage(*pmd));

page = hugepte_page(*(hugepte_t *)pmd);
- if (page) {
+ if (page)
page += ((address & ~HPAGE_MASK) >> PAGE_SHIFT);
- get_page(page);
- }
return page;
}

Index: working-2.6/arch/i386/mm/hugetlbpage.c
===================================================================
--- working-2.6.orig/arch/i386/mm/hugetlbpage.c 2004-04-13 11:42:35.000000000 +1000
+++ working-2.6/arch/i386/mm/hugetlbpage.c 2004-04-16 13:45:22.000000000 +1000
@@ -206,10 +206,8 @@
struct page *page;

page = pte_page(*(pte_t *)pmd);
- if (page) {
+ if (page)
page += ((address & ~HPAGE_MASK) >> PAGE_SHIFT);
- get_page(page);
- }
return page;
}
#endif
Index: working-2.6/arch/ia64/mm/hugetlbpage.c
===================================================================
--- working-2.6.orig/arch/ia64/mm/hugetlbpage.c 2004-04-14 12:22:48.000000000 +1000
+++ working-2.6/arch/ia64/mm/hugetlbpage.c 2004-04-16 13:45:22.000000000 +1000
@@ -170,7 +170,6 @@
ptep = huge_pte_offset(mm, addr);
page = pte_page(*ptep);
page += ((addr & ~HPAGE_MASK) >> PAGE_SHIFT);
- get_page(page);
return page;
}
int pmd_huge(pmd_t pmd)

--
David Gibson | For every complex problem there is a
david AT gibson.dropbear.id.au | solution which is simple, neat and
| wrong.
http://www.ozlabs.org/people/dgibson


2004-04-16 04:30:36

by Andrew Morton

[permalink] [raw]
Subject: Re: Fix bogus get_page() calls in hugepage code

David Gibson <[email protected]> wrote:
>
> On some archs the functions used to implement follow_page() for
> hugepages do a get_page(). This is unlike the normal-page path for
> follow_page(), so presumably a bug. This patch fixes it.

get_user_pages() is supposed to pin the pages which it placed into the
callers pages[] array.

And the caller of get_user_pages() is supposed to unpin those pages when
they are finished with.

So follow_hugetlb_page() is currently doing the right thing. The asymmetry
with follow_page() is awkward, but the overall intent was to minimise the
amount of impact which the hugepage code has on core MM.


Aside: note that get_user_pages() doesn't hold page_table_lock while
walking the pagetables, whereas it does hold that lock while walking the
regular pages' pagetables. This is because the caller of get_user_pages()
holds down_read(mmap_sem), whereas hugetlb pagetable setup and teardown
always happens under down_write(mmap_sem).

2004-04-16 04:39:27

by David Gibson

[permalink] [raw]
Subject: Re: Fix bogus get_page() calls in hugepage code

On Thu, Apr 15, 2004 at 09:30:11PM -0700, Andrew Morton wrote:
> David Gibson <[email protected]> wrote:
> >
> > On some archs the functions used to implement follow_page() for
> > hugepages do a get_page(). This is unlike the normal-page path for
> > follow_page(), so presumably a bug. This patch fixes it.
>
> get_user_pages() is supposed to pin the pages which it placed into the
> callers pages[] array.
>
> And the caller of get_user_pages() is supposed to unpin those pages when
> they are finished with.
>
> So follow_hugetlb_page() is currently doing the right thing. The asymmetry
> with follow_page() is awkward, but the overall intent was to minimise the
> amount of impact which the hugepage code has on core MM.

Yes, but I'm not talking about follow_hugetlb_page() (my patch doesn't
touch it). I'm talking about follow_huge_addr() and
follow_huge_pmd(). These are called to implement follow_page(). In
get_user_pages(), follow_pages() is bypassed bu the call to
follow_hugetlb_page(), but these extra get_pages() are presumably a
bug if follow_page() is called on a hugepage from somewhere other than
get_user_pages().

So I think my patch is correct.

> Aside: note that get_user_pages() doesn't hold page_table_lock while
> walking the pagetables, whereas it does hold that lock while walking the
> regular pages' pagetables. This is because the caller of get_user_pages()
> holds down_read(mmap_sem), whereas hugetlb pagetable setup and teardown
> always happens under down_write(mmap_sem).

Ah, that's useful to know.

--
David Gibson | For every complex problem there is a
david AT gibson.dropbear.id.au | solution which is simple, neat and
| wrong.
http://www.ozlabs.org/people/dgibson

2004-04-16 04:49:30

by Andrew Morton

[permalink] [raw]
Subject: Re: Fix bogus get_page() calls in hugepage code

David Gibson <[email protected]> wrote:
>
> On Thu, Apr 15, 2004 at 09:30:11PM -0700, Andrew Morton wrote:
> > David Gibson <[email protected]> wrote:
> > >
> > > On some archs the functions used to implement follow_page() for
> > > hugepages do a get_page(). This is unlike the normal-page path for
> > > follow_page(), so presumably a bug. This patch fixes it.
> >
> > get_user_pages() is supposed to pin the pages which it placed into the
> > callers pages[] array.
> >
> > And the caller of get_user_pages() is supposed to unpin those pages when
> > they are finished with.
> >
> > So follow_hugetlb_page() is currently doing the right thing. The asymmetry
> > with follow_page() is awkward, but the overall intent was to minimise the
> > amount of impact which the hugepage code has on core MM.
>
> Yes, but I'm not talking about follow_hugetlb_page() (my patch doesn't
> touch it). I'm talking about follow_huge_addr() and
> follow_huge_pmd(). These are called to implement follow_page(). In
> get_user_pages(), follow_pages() is bypassed bu the call to
> follow_hugetlb_page(), but these extra get_pages() are presumably a
> bug if follow_page() is called on a hugepage from somewhere other than
> get_user_pages().

Oh, OK. IIRC that stuff was added to support futexes-in-large-pages. The
caller holds ->page_table_lock, as does zap_hugepage_range(), so that seems
fine.

> So I think my patch is correct.

yup, thanks.