2024-03-23 03:33:24

by Peter Xu

Subject: [PATCH 0/2] mm: small fixup series for mm-unstable

From: Peter Xu <[email protected]>

Andrew,

This is a small series that should fix the two issues reported on
mm-unstable against this series:

https://lore.kernel.org/r/[email protected]

This is build tested on (1) x86_64, allnoconfig+allmodconfig, (2) m68k,
allnoconfig+allmodconfig.

Please consider dropping the quickfix below:

mm-gup-handle-hugepd-for-follow_page-fix

Then apply these two fixups.

Sorry this "small" fixup series is not that small. That said, I tested
applying it, and the fixups should auto-squash fine on the current
mm-unstable with a rebase. If not, feel free to let me know if you'd like
me to resend the whole series with a base commit, or whatever is easiest
for you.

Thanks,

Peter Xu (2):
fixup! mm: make HPAGE_PXD_* macros even if !THP
fixup! mm/gup: handle hugepd for follow_page()

include/linux/huge_mm.h | 16 ++-
mm/gup.c | 287 ++++++++++++++++++++--------------------
2 files changed, 154 insertions(+), 149 deletions(-)

--
2.44.0



2024-03-23 03:33:31

by Peter Xu

Subject: [PATCH 1/2] fixup! mm: make HPAGE_PXD_* macros even if !THP

From: Peter Xu <[email protected]>

[To be squashed into the corresponding patch]

I initially wanted to simply put all the macros under
PGTABLE_HAS_HUGE_LEAVES, but I found that it won't work: we must define
HPAGE_PMD_SHIFT even if PMD_SHIFT is not defined (the !MMU case).

The only solution is to use the old trick of "({ BUILD_BUG(); 0; })".
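
As an illustration only (this sketch is not part of the patch, and the
function name is made up): the statement expression still reads as an
integer expression, so callers keep compiling, and BUILD_BUG() only fires
for a use that the compiler cannot optimize away:

  /*
   * Sketch: with !CONFIG_PGTABLE_HAS_HUGE_LEAVES the macro expands to
   * ({ BUILD_BUG(); 0; }).  The dead branch below is eliminated at
   * compile time, so no build error triggers, while an unconditional
   * use of HPAGE_PMD_SHIFT would break the build.
   */
  static unsigned long sketch_pmd_size(void)
  {
          if (IS_ENABLED(CONFIG_PGTABLE_HAS_HUGE_LEAVES))
                  return 1UL << HPAGE_PMD_SHIFT;
          return PAGE_SIZE;
  }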

Signed-off-by: Peter Xu <[email protected]>
---
include/linux/huge_mm.h | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index f451f0bdab97..d210c849ab7a 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -84,17 +84,23 @@ extern struct kobj_attribute shmem_enabled_attr;
#define thp_vma_allowable_order(vma, vm_flags, smaps, in_pf, enforce_sysfs, order) \
(!!thp_vma_allowable_orders(vma, vm_flags, smaps, in_pf, enforce_sysfs, BIT(order)))

+#ifdef CONFIG_PGTABLE_HAS_HUGE_LEAVES
#define HPAGE_PMD_SHIFT PMD_SHIFT
-#define HPAGE_PMD_SIZE ((1UL) << HPAGE_PMD_SHIFT)
-#define HPAGE_PMD_MASK (~(HPAGE_PMD_SIZE - 1))
+#define HPAGE_PUD_SHIFT PUD_SHIFT
+#else
+#define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
+#define HPAGE_PUD_SHIFT ({ BUILD_BUG(); 0; })
+#endif
+
#define HPAGE_PMD_ORDER (HPAGE_PMD_SHIFT-PAGE_SHIFT)
#define HPAGE_PMD_NR (1<<HPAGE_PMD_ORDER)
+#define HPAGE_PMD_MASK (~(HPAGE_PMD_SIZE - 1))
+#define HPAGE_PMD_SIZE ((1UL) << HPAGE_PMD_SHIFT)

-#define HPAGE_PUD_SHIFT PUD_SHIFT
-#define HPAGE_PUD_SIZE ((1UL) << HPAGE_PUD_SHIFT)
-#define HPAGE_PUD_MASK (~(HPAGE_PUD_SIZE - 1))
#define HPAGE_PUD_ORDER (HPAGE_PUD_SHIFT-PAGE_SHIFT)
#define HPAGE_PUD_NR (1<<HPAGE_PUD_ORDER)
+#define HPAGE_PUD_MASK (~(HPAGE_PUD_SIZE - 1))
+#define HPAGE_PUD_SIZE ((1UL) << HPAGE_PUD_SHIFT)

#ifdef CONFIG_TRANSPARENT_HUGEPAGE

--
2.44.0


2024-03-23 03:33:49

by Peter Xu

Subject: [PATCH 2/2] fixup! mm/gup: handle hugepd for follow_page()

From: Peter Xu <[email protected]>

The major issue is that slow gup will now reuse some fast gup functions
to parse hugepd entries, so we need to move the hugepd handling and the
relevant functions out of the HAVE_FAST_GUP section while keeping them
under CONFIG_MMU.

Meanwhile, the helper record_subpages() can be used by either the hugepd
or the fast-gup section. To avoid "unused function" warnings, we
unfortunately have to guard it with its own preprocessor conditional.
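
The resulting layout in mm/gup.c then looks roughly like below (a sketch
of the structure only; see the diff for the actual code):

  #ifdef CONFIG_MMU

  #if defined(CONFIG_ARCH_HAS_HUGEPD) || defined(CONFIG_HAVE_FAST_GUP)
  /* record_subpages(): shared by hugepd parsing and fast-gup */
  #endif

  #ifdef CONFIG_ARCH_HAS_HUGEPD
  /* hugepte_addr_end(), gup_hugepte(), gup_huge_pd(), follow_hugepd() */
  #else
  /* !CONFIG_ARCH_HAS_HUGEPD stubs for gup_huge_pd() and follow_hugepd() */
  #endif

  /* ... the rest of the CONFIG_MMU section (slow gup etc.) ... */

  #endif /* CONFIG_MMU */

The fast-gup code later in the file keeps calling gup_huge_pd() as it did
before.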

Signed-off-by: Peter Xu <[email protected]>
---
mm/gup.c | 287 +++++++++++++++++++++++++++----------------------------
1 file changed, 143 insertions(+), 144 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 4cd349390477..fe9df268bef2 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -30,11 +30,6 @@ struct follow_page_context {
unsigned int page_mask;
};

-static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
- unsigned long addr, unsigned int pdshift,
- unsigned int flags,
- struct follow_page_context *ctx);
-
static inline void sanity_check_pinned_pages(struct page **pages,
unsigned long npages)
{
@@ -505,6 +500,149 @@ static inline void mm_set_has_pinned_flag(unsigned long *mm_flags)
}

#ifdef CONFIG_MMU
+
+#if defined(CONFIG_ARCH_HAS_HUGEPD) || defined(CONFIG_HAVE_FAST_GUP)
+static int record_subpages(struct page *page, unsigned long sz,
+ unsigned long addr, unsigned long end,
+ struct page **pages)
+{
+ struct page *start_page;
+ int nr;
+
+ start_page = nth_page(page, (addr & (sz - 1)) >> PAGE_SHIFT);
+ for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
+ pages[nr] = nth_page(start_page, nr);
+
+ return nr;
+}
+#endif /* CONFIG_ARCH_HAS_HUGEPD || CONFIG_HAVE_FAST_GUP */
+
+#ifdef CONFIG_ARCH_HAS_HUGEPD
+static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end,
+ unsigned long sz)
+{
+ unsigned long __boundary = (addr + sz) & ~(sz-1);
+ return (__boundary - 1 < end - 1) ? __boundary : end;
+}
+
+static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
+ unsigned long end, unsigned int flags,
+ struct page **pages, int *nr)
+{
+ unsigned long pte_end;
+ struct page *page;
+ struct folio *folio;
+ pte_t pte;
+ int refs;
+
+ pte_end = (addr + sz) & ~(sz-1);
+ if (pte_end < end)
+ end = pte_end;
+
+ pte = huge_ptep_get(ptep);
+
+ if (!pte_access_permitted(pte, flags & FOLL_WRITE))
+ return 0;
+
+ /* hugepages are never "special" */
+ VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+
+ page = pte_page(pte);
+ refs = record_subpages(page, sz, addr, end, pages + *nr);
+
+ folio = try_grab_folio(page, refs, flags);
+ if (!folio)
+ return 0;
+
+ if (unlikely(pte_val(pte) != pte_val(ptep_get(ptep)))) {
+ gup_put_folio(folio, refs, flags);
+ return 0;
+ }
+
+ if (!pte_write(pte) && gup_must_unshare(NULL, flags, &folio->page)) {
+ gup_put_folio(folio, refs, flags);
+ return 0;
+ }
+
+ *nr += refs;
+ folio_set_referenced(folio);
+ return 1;
+}
+
+/*
+ * NOTE: currently GUP for a hugepd is only possible on hugetlbfs file
+ * systems on Power, which does not have issue with folio writeback against
+ * GUP updates. When hugepd will be extended to support non-hugetlbfs or
+ * even anonymous memory, we need to do extra check as what we do with most
+ * of the other folios. See writable_file_mapping_allowed() and
+ * folio_fast_pin_allowed() for more information.
+ */
+static int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
+ unsigned int pdshift, unsigned long end, unsigned int flags,
+ struct page **pages, int *nr)
+{
+ pte_t *ptep;
+ unsigned long sz = 1UL << hugepd_shift(hugepd);
+ unsigned long next;
+
+ ptep = hugepte_offset(hugepd, addr, pdshift);
+ do {
+ next = hugepte_addr_end(addr, end, sz);
+ if (!gup_hugepte(ptep, sz, addr, end, flags, pages, nr))
+ return 0;
+ } while (ptep++, addr = next, addr != end);
+
+ return 1;
+}
+
+static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
+ unsigned long addr, unsigned int pdshift,
+ unsigned int flags,
+ struct follow_page_context *ctx)
+{
+ struct page *page;
+ struct hstate *h;
+ spinlock_t *ptl;
+ int nr = 0, ret;
+ pte_t *ptep;
+
+ /* Only hugetlb supports hugepd */
+ if (WARN_ON_ONCE(!is_vm_hugetlb_page(vma)))
+ return ERR_PTR(-EFAULT);
+
+ h = hstate_vma(vma);
+ ptep = hugepte_offset(hugepd, addr, pdshift);
+ ptl = huge_pte_lock(h, vma->vm_mm, ptep);
+ ret = gup_huge_pd(hugepd, addr, pdshift, addr + PAGE_SIZE,
+ flags, &page, &nr);
+ spin_unlock(ptl);
+
+ if (ret) {
+ WARN_ON_ONCE(nr != 1);
+ ctx->page_mask = (1U << huge_page_order(h)) - 1;
+ return page;
+ }
+
+ return NULL;
+}
+#else /* CONFIG_ARCH_HAS_HUGEPD */
+static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
+ unsigned int pdshift, unsigned long end, unsigned int flags,
+ struct page **pages, int *nr)
+{
+ return 0;
+}
+
+static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
+ unsigned long addr, unsigned int pdshift,
+ unsigned int flags,
+ struct follow_page_context *ctx)
+{
+ return NULL;
+}
+#endif /* CONFIG_ARCH_HAS_HUGEPD */
+
+
static struct page *no_page_table(struct vm_area_struct *vma,
unsigned int flags, unsigned long address)
{
@@ -2962,145 +3100,6 @@ static int __gup_device_huge_pud(pud_t pud, pud_t *pudp, unsigned long addr,
}
#endif

-static int record_subpages(struct page *page, unsigned long sz,
- unsigned long addr, unsigned long end,
- struct page **pages)
-{
- struct page *start_page;
- int nr;
-
- start_page = nth_page(page, (addr & (sz - 1)) >> PAGE_SHIFT);
- for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
- pages[nr] = nth_page(start_page, nr);
-
- return nr;
-}
-
-#ifdef CONFIG_ARCH_HAS_HUGEPD
-static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end,
- unsigned long sz)
-{
- unsigned long __boundary = (addr + sz) & ~(sz-1);
- return (__boundary - 1 < end - 1) ? __boundary : end;
-}
-
-static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
-{
- unsigned long pte_end;
- struct page *page;
- struct folio *folio;
- pte_t pte;
- int refs;
-
- pte_end = (addr + sz) & ~(sz-1);
- if (pte_end < end)
- end = pte_end;
-
- pte = huge_ptep_get(ptep);
-
- if (!pte_access_permitted(pte, flags & FOLL_WRITE))
- return 0;
-
- /* hugepages are never "special" */
- VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
-
- page = pte_page(pte);
- refs = record_subpages(page, sz, addr, end, pages + *nr);
-
- folio = try_grab_folio(page, refs, flags);
- if (!folio)
- return 0;
-
- if (unlikely(pte_val(pte) != pte_val(ptep_get(ptep)))) {
- gup_put_folio(folio, refs, flags);
- return 0;
- }
-
- if (!pte_write(pte) && gup_must_unshare(NULL, flags, &folio->page)) {
- gup_put_folio(folio, refs, flags);
- return 0;
- }
-
- *nr += refs;
- folio_set_referenced(folio);
- return 1;
-}
-
-/*
- * NOTE: currently GUP for a hugepd is only possible on hugetlbfs file
- * systems on Power, which does not have issue with folio writeback against
- * GUP updates. When hugepd will be extended to support non-hugetlbfs or
- * even anonymous memory, we need to do extra check as what we do with most
- * of the other folios. See writable_file_mapping_allowed() and
- * folio_fast_pin_allowed() for more information.
- */
-static int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
- unsigned int pdshift, unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
-{
- pte_t *ptep;
- unsigned long sz = 1UL << hugepd_shift(hugepd);
- unsigned long next;
-
- ptep = hugepte_offset(hugepd, addr, pdshift);
- do {
- next = hugepte_addr_end(addr, end, sz);
- if (!gup_hugepte(ptep, sz, addr, end, flags, pages, nr))
- return 0;
- } while (ptep++, addr = next, addr != end);
-
- return 1;
-}
-
-static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
- unsigned long addr, unsigned int pdshift,
- unsigned int flags,
- struct follow_page_context *ctx)
-{
- struct page *page;
- struct hstate *h;
- spinlock_t *ptl;
- int nr = 0, ret;
- pte_t *ptep;
-
- /* Only hugetlb supports hugepd */
- if (WARN_ON_ONCE(!is_vm_hugetlb_page(vma)))
- return ERR_PTR(-EFAULT);
-
- h = hstate_vma(vma);
- ptep = hugepte_offset(hugepd, addr, pdshift);
- ptl = huge_pte_lock(h, vma->vm_mm, ptep);
- ret = gup_huge_pd(hugepd, addr, pdshift, addr + PAGE_SIZE,
- flags, &page, &nr);
- spin_unlock(ptl);
-
- if (ret) {
- WARN_ON_ONCE(nr != 1);
- ctx->page_mask = (1U << huge_page_order(h)) - 1;
- return page;
- }
-
- return NULL;
-}
-#else
-static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
- unsigned int pdshift, unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
-{
- return 0;
-}
-
-static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
- unsigned long addr, unsigned int pdshift,
- unsigned int flags,
- struct follow_page_context *ctx)
-{
- return NULL;
-}
-#endif /* CONFIG_ARCH_HAS_HUGEPD */
-
static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
unsigned long end, unsigned int flags,
struct page **pages, int *nr)
--
2.44.0


2024-03-23 17:51:21

by SeongJae Park

Subject: Re: [PATCH 0/2] mm: small fixup series for mm-unstable

On Fri, 22 Mar 2024 23:33:08 -0400 [email protected] wrote:

> From: Peter Xu <[email protected]>
>
> Andrew,
>
> This is a small series that should fix the two issues reported on
> mm-unstable against this series:
>
> https://lore.kernel.org/r/[email protected]
>
> This is build tested on (1) x86_64, allnoconfig+allmodconfig, (2) m68k,
> allnoconfig+allmodconfig.
>
> Please consider dropping the quickfix below:
>
> mm-gup-handle-hugepd-for-follow_page-fix
>
> Then apply these two fixups.
>
> Sorry this "small" fixup series is not that small. That said, I tested
> applying it, and the fixups should auto-squash fine on the current
> mm-unstable with a rebase. If not, feel free to let me know if you'd like
> me to resend the whole series with a base commit, or whatever is easiest
> for you.
>
> Thanks,

I confirmed this fixes the build issue I reported[1] yesterday.

Tested-by: SeongJae Park <[email protected]>

[1] https://lore.kernel.org/r/[email protected]


Thanks,
SJ

>
> Peter Xu (2):
> fixup! mm: make HPAGE_PXD_* macros even if !THP
> fixup! mm/gup: handle hugepd for follow_page()
>
> include/linux/huge_mm.h | 16 ++-
> mm/gup.c | 287 ++++++++++++++++++++--------------------
> 2 files changed, 154 insertions(+), 149 deletions(-)
>
> --
> 2.44.0