Andrew, please apply:
On some archs the functions used to implement follow_page() for
hugepages do a get_page(). This is unlike the normal-page path for
follow_page(), so presumably a bug. This patch fixes it.
Index: working-2.6/arch/ppc64/mm/hugetlbpage.c
===================================================================
--- working-2.6.orig/arch/ppc64/mm/hugetlbpage.c 2004-04-13 11:42:35.000000000 +1000
+++ working-2.6/arch/ppc64/mm/hugetlbpage.c 2004-04-16 13:45:22.000000000 +1000
@@ -360,10 +360,8 @@
BUG_ON(! pmd_hugepage(*pmd));
page = hugepte_page(*(hugepte_t *)pmd);
- if (page) {
+ if (page)
page += ((address & ~HPAGE_MASK) >> PAGE_SHIFT);
- get_page(page);
- }
return page;
}
Index: working-2.6/arch/i386/mm/hugetlbpage.c
===================================================================
--- working-2.6.orig/arch/i386/mm/hugetlbpage.c 2004-04-13 11:42:35.000000000 +1000
+++ working-2.6/arch/i386/mm/hugetlbpage.c 2004-04-16 13:45:22.000000000 +1000
@@ -206,10 +206,8 @@
struct page *page;
page = pte_page(*(pte_t *)pmd);
- if (page) {
+ if (page)
page += ((address & ~HPAGE_MASK) >> PAGE_SHIFT);
- get_page(page);
- }
return page;
}
#endif
Index: working-2.6/arch/ia64/mm/hugetlbpage.c
===================================================================
--- working-2.6.orig/arch/ia64/mm/hugetlbpage.c 2004-04-14 12:22:48.000000000 +1000
+++ working-2.6/arch/ia64/mm/hugetlbpage.c 2004-04-16 13:45:22.000000000 +1000
@@ -170,7 +170,6 @@
ptep = huge_pte_offset(mm, addr);
page = pte_page(*ptep);
page += ((addr & ~HPAGE_MASK) >> PAGE_SHIFT);
- get_page(page);
return page;
}
int pmd_huge(pmd_t pmd)
--
David Gibson | For every complex problem there is a
david AT gibson.dropbear.id.au | solution which is simple, neat and
| wrong.
http://www.ozlabs.org/people/dgibson
David Gibson <[email protected]> wrote:
>
> On some archs the functions used to implement follow_page() for
> hugepages do a get_page(). This is unlike the normal-page path for
> follow_page(), so presumably a bug. This patch fixes it.
get_user_pages() is supposed to pin the pages which it placed into the
caller's pages[] array.
And the caller of get_user_pages() is supposed to unpin those pages when
they are finished with.
So follow_hugetlb_page() is currently doing the right thing. The asymmetry
with follow_page() is awkward, but the overall intent was to minimise the
amount of impact which the hugepage code has on core MM.
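(Illustrative sketch only, not code from the patch above: the caller-side
convention described here, written against the 2.6-era get_user_pages()
arguments as I remember them -- exact signatures may differ, and NPAGES
and start are made-up placeholder names:)

	struct page *pages[NPAGES];
	int i, got;

	down_read(&current->mm->mmap_sem);
	got = get_user_pages(current, current->mm, start, NPAGES,
			     1 /* write */, 0 /* force */, pages, NULL);
	up_read(&current->mm->mmap_sem);

	/* ... use the pinned pages ... */

	for (i = 0; i < got; i++)
		put_page(pages[i]);	/* drop the reference get_user_pages() took */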
Aside: note that get_user_pages() doesn't hold page_table_lock while
walking the hugepage pagetables, whereas it does hold that lock while
walking the regular pages' pagetables. This is because the caller of
get_user_pages() holds down_read(mmap_sem), whereas hugetlb pagetable
setup and teardown always happen under down_write(mmap_sem).
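(Again just a sketch to make that asymmetry concrete, not code from the
tree -- unmap_hugepage_range() is the real teardown helper, the rest is
schematic:)

	/* hugetlb pagetable teardown only ever runs with mmap_sem held
	 * for write, e.g. on the munmap path: */
	down_write(&mm->mmap_sem);
	do_munmap(mm, addr, len);	/* reaches unmap_hugepage_range() for hugetlb VMAs */
	up_write(&mm->mmap_sem);

	/* ...so a get_user_pages() caller holding only down_read(&mm->mmap_sem)
	 * can walk the hugetlb pagetables without taking page_table_lock. */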
On Thu, Apr 15, 2004 at 09:30:11PM -0700, Andrew Morton wrote:
> David Gibson <[email protected]> wrote:
> >
> > On some archs the functions used to implement follow_page() for
> > hugepages do a get_page(). This is unlike the normal-page path for
> > follow_page(), so presumably a bug. This patch fixes it.
>
> get_user_pages() is supposed to pin the pages which it placed into the
> caller's pages[] array.
>
> And the caller of get_user_pages() is supposed to unpin those pages when
> they are finished with.
>
> So follow_hugetlb_page() is currently doing the right thing. The asymmetry
> with follow_page() is awkward, but the overall intent was to minimise the
> amount of impact which the hugepage code has on core MM.
Yes, but I'm not talking about follow_hugetlb_page() (my patch doesn't
touch it). I'm talking about follow_huge_addr() and
follow_huge_pmd(). These are called to implement follow_page(). In
get_user_pages(), follow_page() is bypassed by the call to
follow_hugetlb_page(), but these extra get_page()s are presumably a
bug if follow_page() is called on a hugepage from somewhere other than
get_user_pages().
So I think my patch is correct.
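(For reference, roughly how that dispatch looks in current 2.6 -- this is
from memory, so the details and signatures are approximate rather than
copied from the tree:)

	struct page *follow_page(struct mm_struct *mm, unsigned long address, int write)
	{
		pgd_t *pgd;
		pmd_t *pmd;

		/* arch hook: some archs resolve hugepage addresses up front
		 * via follow_huge_addr() (details elided) */

		pgd = pgd_offset(mm, address);
		if (pgd_none(*pgd) || pgd_bad(*pgd))
			return NULL;

		pmd = pmd_offset(pgd, address);
		if (pmd_none(*pmd))
			return NULL;
		if (pmd_huge(*pmd))
			return follow_huge_pmd(mm, address, pmd, write);

		/* ... normal pte walk: returns the page without a get_page() ... */
		return NULL;
	}

get_user_pages(), by contrast, checks is_vm_hugetlb_page(vma) and goes
straight to follow_hugetlb_page(), so these helpers only matter for other
follow_page() callers.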
> Aside: note that get_user_pages() doesn't hold page_table_lock while
> walking the hugepage pagetables, whereas it does hold that lock while
> walking the regular pages' pagetables. This is because the caller of
> get_user_pages() holds down_read(mmap_sem), whereas hugetlb pagetable
> setup and teardown always happen under down_write(mmap_sem).
Ah, that's useful to know.
--
David Gibson | For every complex problem there is a
david AT gibson.dropbear.id.au | solution which is simple, neat and
| wrong.
http://www.ozlabs.org/people/dgibson
David Gibson <[email protected]> wrote:
>
> On Thu, Apr 15, 2004 at 09:30:11PM -0700, Andrew Morton wrote:
> > David Gibson <[email protected]> wrote:
> > >
> > > On some archs the functions used to implement follow_page() for
> > > hugepages do a get_page(). This is unlike the normal-page path for
> > > follow_page(), so presumably a bug. This patch fixes it.
> >
> > get_user_pages() is supposed to pin the pages which it placed into the
> > caller's pages[] array.
> >
> > And the caller of get_user_pages() is supposed to unpin those pages when
> > they are finished with.
> >
> > So follow_hugetlb_page() is currently doing the right thing. The asymmetry
> > with follow_page() is awkward, but the overall intent was to minimise the
> > amount of impact which the hugepage code has on core MM.
>
> Yes, but I'm not talking about follow_hugetlb_page() (my patch doesn't
> touch it). I'm talking about follow_huge_addr() and
> follow_huge_pmd(). These are called to implement follow_page(). In
> get_user_pages(), follow_page() is bypassed by the call to
> follow_hugetlb_page(), but these extra get_page()s are presumably a
> bug if follow_page() is called on a hugepage from somewhere other than
> get_user_pages().
Oh, OK. IIRC that stuff was added to support futexes-in-large-pages. The
caller holds ->page_table_lock, as does zap_hugepage_range(), so that seems
fine.
> So I think my patch is correct.
yup, thanks.