2022-11-01 22:53:03

by Sidhartha Kumar

Subject: [PATCH v2 0/9] convert hugetlb_cgroup helper functions to folios

This patch series continues the conversion of hugetlb code from being
managed in pages to folios by converting many of the hugetlb_cgroup helper
functions to use folios. This allows the core hugetlb functions to pass
folios directly to these helper functions.
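
The general shape of the conversion, condensed from the patches below (an
illustrative sketch only, not lifted verbatim from any single patch):

	/* before: helpers take a page and look up the folio internally */
	page_hcg = hugetlb_cgroup_from_page(page);
	set_hugetlb_cgroup(page, NULL);

	/*
	 * after: callers convert once with page_folio() and the
	 * helpers operate on the folio directly
	 */
	struct folio *folio = page_folio(page);

	page_hcg = hugetlb_cgroup_from_folio(folio);
	set_hugetlb_cgroup(folio, NULL);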

This series depends on my previous patch series, which begins the hugetlb
folio conversion[1]. Both series pass the LTP hugetlb test cases.


[1] https://lore.kernel.org/lkml/[email protected]/

v1 -> v2
- fix folio_test_hugetlb_freed() check in alloc_and_dissolve_huge_page
- add more folio conversions to unmap_and_move_huge_page in patch 9
- collect Reviewed-by tags from Mike Kravetz
- rebase on next-20221101 and mm-unstable

Sidhartha Kumar (9):
mm/hugetlb_cgroup: convert __set_hugetlb_cgroup() to folios
mm/hugetlb_cgroup: convert hugetlb_cgroup_from_page() to folios
mm/hugetlb_cgroup: convert set_hugetlb_cgroup*() to folios
mm/hugetlb_cgroup: convert hugetlb_cgroup_migrate to folios
mm/hugetlb: convert isolate_or_dissolve_huge_page to folios
mm/hugetlb: convert free_huge_page to folios
mm/hugetlb_cgroup: convert hugetlb_cgroup_uncharge_page() to folios
mm/hugetlb_cgroup: convert hugetlb_cgroup_commit_charge*() to folios
mm/hugetlb: convert move_hugetlb_state() to folios

include/linux/hugetlb.h | 6 +-
include/linux/hugetlb_cgroup.h | 85 ++++++++++++------------
mm/hugetlb.c | 115 ++++++++++++++++++---------------
mm/hugetlb_cgroup.c | 63 +++++++++---------
mm/migrate.c | 12 ++--
5 files changed, 149 insertions(+), 132 deletions(-)

--
2.31.1



2022-11-01 22:56:40

by Sidhartha Kumar

Subject: [PATCH v2 4/9] mm/hugetlb_cgroup: convert hugetlb_cgroup_migrate to folios

Cleans up intermediate page to folio conversion code in
hugetlb_cgroup_migrate() by changing its arguments from pages to folios.

Signed-off-by: Sidhartha Kumar <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
---
include/linux/hugetlb_cgroup.h | 8 ++++----
mm/hugetlb.c | 2 +-
mm/hugetlb_cgroup.c | 8 +++-----
3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
index a7e3540f7f38..789b6fef176d 100644
--- a/include/linux/hugetlb_cgroup.h
+++ b/include/linux/hugetlb_cgroup.h
@@ -177,8 +177,8 @@ extern void hugetlb_cgroup_uncharge_file_region(struct resv_map *resv,
bool region_del);

extern void hugetlb_cgroup_file_init(void) __init;
-extern void hugetlb_cgroup_migrate(struct page *oldhpage,
- struct page *newhpage);
+extern void hugetlb_cgroup_migrate(struct folio *old_folio,
+ struct folio *new_folio);

#else
static inline void hugetlb_cgroup_uncharge_file_region(struct resv_map *resv,
@@ -286,8 +286,8 @@ static inline void hugetlb_cgroup_file_init(void)
{
}

-static inline void hugetlb_cgroup_migrate(struct page *oldhpage,
- struct page *newhpage)
+static inline void hugetlb_cgroup_migrate(struct folio *old_folio,
+ struct folio *new_folio)
{
}

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a6384fb0b57f..2a48feadb41c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7290,7 +7290,7 @@ void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason)
{
struct hstate *h = page_hstate(oldpage);

- hugetlb_cgroup_migrate(oldpage, newpage);
+ hugetlb_cgroup_migrate(page_folio(oldpage), page_folio(newpage));
set_page_owner_migrate_reason(newpage, reason);

/*
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 692b23b5d423..351ffb40261c 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -888,13 +888,11 @@ void __init hugetlb_cgroup_file_init(void)
* hugetlb_lock will make sure a parallel cgroup rmdir won't happen
* when we migrate hugepages
*/
-void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
+void hugetlb_cgroup_migrate(struct folio *old_folio, struct folio *new_folio)
{
struct hugetlb_cgroup *h_cg;
struct hugetlb_cgroup *h_cg_rsvd;
- struct hstate *h = page_hstate(oldhpage);
- struct folio *old_folio = page_folio(oldhpage);
- struct folio *new_folio = page_folio(newhpage);
+ struct hstate *h = folio_hstate(old_folio);

if (hugetlb_cgroup_disabled())
return;
@@ -908,7 +906,7 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
/* move the h_cg details to new cgroup */
set_hugetlb_cgroup(new_folio, h_cg);
set_hugetlb_cgroup_rsvd(new_folio, h_cg_rsvd);
- list_move(&newhpage->lru, &h->hugepage_activelist);
+ list_move(&new_folio->lru, &h->hugepage_activelist);
spin_unlock_irq(&hugetlb_lock);
return;
}
--
2.31.1


2022-11-01 22:57:08

by Sidhartha Kumar

Subject: [PATCH v2 6/9] mm/hugetlb: convert free_huge_page to folios

Use folios inside free_huge_page(); this is in preparation for converting
hugetlb_cgroup_uncharge_page() to take in a folio.

Signed-off-by: Sidhartha Kumar <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
---
mm/hugetlb.c | 27 ++++++++++++++-------------
1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bcc39d2613b2..387b8d74107d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1688,21 +1688,22 @@ void free_huge_page(struct page *page)
* Can't pass hstate in here because it is called from the
* compound page destructor.
*/
- struct hstate *h = page_hstate(page);
- int nid = page_to_nid(page);
- struct hugepage_subpool *spool = hugetlb_page_subpool(page);
+ struct folio *folio = page_folio(page);
+ struct hstate *h = folio_hstate(folio);
+ int nid = folio_nid(folio);
+ struct hugepage_subpool *spool = hugetlb_folio_subpool(folio);
bool restore_reserve;
unsigned long flags;

- VM_BUG_ON_PAGE(page_count(page), page);
- VM_BUG_ON_PAGE(page_mapcount(page), page);
+ VM_BUG_ON_FOLIO(folio_ref_count(folio), folio);
+ VM_BUG_ON_FOLIO(folio_mapcount(folio), folio);

- hugetlb_set_page_subpool(page, NULL);
- if (PageAnon(page))
- __ClearPageAnonExclusive(page);
- page->mapping = NULL;
- restore_reserve = HPageRestoreReserve(page);
- ClearHPageRestoreReserve(page);
+ hugetlb_set_folio_subpool(folio, NULL);
+ if (folio_test_anon(folio))
+ __ClearPageAnonExclusive(&folio->page);
+ folio->mapping = NULL;
+ restore_reserve = folio_test_hugetlb_restore_reserve(folio);
+ folio_clear_hugetlb_restore_reserve(folio);

/*
* If HPageRestoreReserve was set on page, page allocation consumed a
@@ -1724,7 +1725,7 @@ void free_huge_page(struct page *page)
}

spin_lock_irqsave(&hugetlb_lock, flags);
- ClearHPageMigratable(page);
+ folio_clear_hugetlb_migratable(folio);
hugetlb_cgroup_uncharge_page(hstate_index(h),
pages_per_huge_page(h), page);
hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h),
@@ -1732,7 +1733,7 @@ void free_huge_page(struct page *page)
if (restore_reserve)
h->resv_huge_pages++;

- if (HPageTemporary(page)) {
+ if (folio_test_hugetlb_temporary(folio)) {
remove_hugetlb_page(h, page, false);
spin_unlock_irqrestore(&hugetlb_lock, flags);
update_and_free_page(h, page, true);
--
2.31.1


2022-11-01 22:58:28

by Sidhartha Kumar

Subject: [PATCH v2 3/9] mm/hugetlb_cgroup: convert set_hugetlb_cgroup*() to folios

Allows __prep_new_huge_page() to operate on a folio by converting
set_hugetlb_cgroup*() to take in a folio.

Signed-off-by: Sidhartha Kumar <[email protected]>
---
include/linux/hugetlb_cgroup.h | 12 ++++++------
mm/hugetlb.c | 33 +++++++++++++++++++--------------
mm/hugetlb_cgroup.c | 11 ++++++-----
3 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
index feb2edafc8b6..a7e3540f7f38 100644
--- a/include/linux/hugetlb_cgroup.h
+++ b/include/linux/hugetlb_cgroup.h
@@ -112,16 +112,16 @@ static inline void __set_hugetlb_cgroup(struct folio *folio,
(unsigned long)h_cg);
}

-static inline void set_hugetlb_cgroup(struct page *page,
+static inline void set_hugetlb_cgroup(struct folio *folio,
struct hugetlb_cgroup *h_cg)
{
- __set_hugetlb_cgroup(page_folio(page), h_cg, false);
+ __set_hugetlb_cgroup(folio, h_cg, false);
}

-static inline void set_hugetlb_cgroup_rsvd(struct page *page,
+static inline void set_hugetlb_cgroup_rsvd(struct folio *folio,
struct hugetlb_cgroup *h_cg)
{
- __set_hugetlb_cgroup(page_folio(page), h_cg, true);
+ __set_hugetlb_cgroup(folio, h_cg, true);
}

static inline bool hugetlb_cgroup_disabled(void)
@@ -199,12 +199,12 @@ hugetlb_cgroup_from_folio_rsvd(struct folio *folio)
return NULL;
}

-static inline void set_hugetlb_cgroup(struct page *page,
+static inline void set_hugetlb_cgroup(struct folio *folio,
struct hugetlb_cgroup *h_cg)
{
}

-static inline void set_hugetlb_cgroup_rsvd(struct page *page,
+static inline void set_hugetlb_cgroup_rsvd(struct folio *folio,
struct hugetlb_cgroup *h_cg)
{
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 27b87dc85c48..a6384fb0b57f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1758,19 +1758,21 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)
h->nr_huge_pages_node[nid]++;
}

-static void __prep_new_huge_page(struct hstate *h, struct page *page)
+static void __prep_new_hugetlb_folio(struct hstate *h, struct folio *folio)
{
- hugetlb_vmemmap_optimize(h, page);
- INIT_LIST_HEAD(&page->lru);
- set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
- hugetlb_set_page_subpool(page, NULL);
- set_hugetlb_cgroup(page, NULL);
- set_hugetlb_cgroup_rsvd(page, NULL);
+ hugetlb_vmemmap_optimize(h, &folio->page);
+ INIT_LIST_HEAD(&folio->lru);
+ folio->_folio_dtor = HUGETLB_PAGE_DTOR;
+ hugetlb_set_folio_subpool(folio, NULL);
+ set_hugetlb_cgroup(folio, NULL);
+ set_hugetlb_cgroup_rsvd(folio, NULL);
}

static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
{
- __prep_new_huge_page(h, page);
+ struct folio *folio = page_folio(page);
+
+ __prep_new_hugetlb_folio(h, folio);
spin_lock_irq(&hugetlb_lock);
__prep_account_new_huge_page(h, nid);
spin_unlock_irq(&hugetlb_lock);
@@ -2731,8 +2733,10 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
struct list_head *list)
{
gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
- int nid = page_to_nid(old_page);
+ struct folio *old_folio = page_folio(old_page);
+ int nid = folio_nid(old_folio);
struct page *new_page;
+ struct folio *new_folio;
int ret = 0;

/*
@@ -2745,16 +2749,17 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
new_page = alloc_buddy_huge_page(h, gfp_mask, nid, NULL, NULL);
if (!new_page)
return -ENOMEM;
- __prep_new_huge_page(h, new_page);
+ new_folio = page_folio(new_page);
+ __prep_new_hugetlb_folio(h, new_folio);

retry:
spin_lock_irq(&hugetlb_lock);
- if (!PageHuge(old_page)) {
+ if (!folio_test_hugetlb(old_folio)) {
/*
* Freed from under us. Drop new_page too.
*/
goto free_new;
- } else if (page_count(old_page)) {
+ } else if (folio_ref_count(old_folio)) {
/*
* Someone has grabbed the page, try to isolate it here.
* Fail with -EBUSY if not possible.
@@ -2763,7 +2768,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
ret = isolate_hugetlb(old_page, list);
spin_lock_irq(&hugetlb_lock);
goto free_new;
- } else if (!HPageFreed(old_page)) {
+ } else if (!folio_test_hugetlb_freed(old_folio)) {
/*
* Page's refcount is 0 but it has not been enqueued in the
* freelist yet. Race window is small, so we can succeed here if
@@ -2801,7 +2806,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
free_new:
spin_unlock_irq(&hugetlb_lock);
/* Page has a zero ref count, but needs a ref to be freed */
- set_page_refcounted(new_page);
+ folio_ref_unfreeze(new_folio, 1);
update_and_free_page(h, new_page, false);

return ret;
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 600c98560a0f..692b23b5d423 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -212,7 +212,7 @@ static void hugetlb_cgroup_move_parent(int idx, struct hugetlb_cgroup *h_cg,
/* Take the pages off the local counter */
page_counter_cancel(counter, nr_pages);

- set_hugetlb_cgroup(page, parent);
+ set_hugetlb_cgroup(folio, parent);
out:
return;
}
@@ -894,6 +894,7 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
struct hugetlb_cgroup *h_cg_rsvd;
struct hstate *h = page_hstate(oldhpage);
struct folio *old_folio = page_folio(oldhpage);
+ struct folio *new_folio = page_folio(newhpage);

if (hugetlb_cgroup_disabled())
return;
@@ -901,12 +902,12 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
spin_lock_irq(&hugetlb_lock);
h_cg = hugetlb_cgroup_from_folio(old_folio);
h_cg_rsvd = hugetlb_cgroup_from_folio_rsvd(old_folio);
- set_hugetlb_cgroup(oldhpage, NULL);
- set_hugetlb_cgroup_rsvd(oldhpage, NULL);
+ set_hugetlb_cgroup(old_folio, NULL);
+ set_hugetlb_cgroup_rsvd(old_folio, NULL);

/* move the h_cg details to new cgroup */
- set_hugetlb_cgroup(newhpage, h_cg);
- set_hugetlb_cgroup_rsvd(newhpage, h_cg_rsvd);
+ set_hugetlb_cgroup(new_folio, h_cg);
+ set_hugetlb_cgroup_rsvd(new_folio, h_cg_rsvd);
list_move(&newhpage->lru, &h->hugepage_activelist);
spin_unlock_irq(&hugetlb_lock);
return;
--
2.31.1


2022-11-01 22:58:57

by Sidhartha Kumar

Subject: [PATCH v2 7/9] mm/hugetlb_cgroup: convert hugetlb_cgroup_uncharge_page() to folios

Continue to use a folio inside free_huge_page() by converting
hugetlb_cgroup_uncharge_page*() to folios.

Signed-off-by: Sidhartha Kumar <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
---
include/linux/hugetlb_cgroup.h | 16 ++++++++--------
mm/hugetlb.c | 15 +++++++++------
mm/hugetlb_cgroup.c | 21 ++++++++++-----------
3 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
index 789b6fef176d..c70f92fe493e 100644
--- a/include/linux/hugetlb_cgroup.h
+++ b/include/linux/hugetlb_cgroup.h
@@ -158,10 +158,10 @@ extern void hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages,
extern void hugetlb_cgroup_commit_charge_rsvd(int idx, unsigned long nr_pages,
struct hugetlb_cgroup *h_cg,
struct page *page);
-extern void hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
- struct page *page);
-extern void hugetlb_cgroup_uncharge_page_rsvd(int idx, unsigned long nr_pages,
- struct page *page);
+extern void hugetlb_cgroup_uncharge_folio(int idx, unsigned long nr_pages,
+ struct folio *folio);
+extern void hugetlb_cgroup_uncharge_folio_rsvd(int idx, unsigned long nr_pages,
+ struct folio *folio);

extern void hugetlb_cgroup_uncharge_cgroup(int idx, unsigned long nr_pages,
struct hugetlb_cgroup *h_cg);
@@ -254,14 +254,14 @@ hugetlb_cgroup_commit_charge_rsvd(int idx, unsigned long nr_pages,
{
}

-static inline void hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
- struct page *page)
+static inline void hugetlb_cgroup_uncharge_folio(int idx, unsigned long nr_pages,
+ struct folio *folio)
{
}

-static inline void hugetlb_cgroup_uncharge_page_rsvd(int idx,
+static inline void hugetlb_cgroup_uncharge_folio_rsvd(int idx,
unsigned long nr_pages,
- struct page *page)
+ struct folio *folio)
{
}
static inline void hugetlb_cgroup_uncharge_cgroup(int idx,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 387b8d74107d..2ecc0a6cf883 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1726,10 +1726,10 @@ void free_huge_page(struct page *page)

spin_lock_irqsave(&hugetlb_lock, flags);
folio_clear_hugetlb_migratable(folio);
- hugetlb_cgroup_uncharge_page(hstate_index(h),
- pages_per_huge_page(h), page);
- hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h),
- pages_per_huge_page(h), page);
+ hugetlb_cgroup_uncharge_folio(hstate_index(h),
+ pages_per_huge_page(h), folio);
+ hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
+ pages_per_huge_page(h), folio);
if (restore_reserve)
h->resv_huge_pages++;

@@ -2855,6 +2855,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
struct hugepage_subpool *spool = subpool_vma(vma);
struct hstate *h = hstate_vma(vma);
struct page *page;
+ struct folio *folio;
long map_chg, map_commit;
long gbl_chg;
int ret, idx;
@@ -2918,6 +2919,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
* a reservation exists for the allocation.
*/
page = dequeue_huge_page_vma(h, vma, addr, avoid_reserve, gbl_chg);
+
if (!page) {
spin_unlock_irq(&hugetlb_lock);
page = alloc_buddy_huge_page_with_mpol(h, vma, addr);
@@ -2932,6 +2934,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
set_page_refcounted(page);
/* Fall through */
}
+ folio = page_folio(page);
hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, page);
/* If allocation is not consuming a reservation, also store the
* hugetlb_cgroup pointer on the page.
@@ -2961,8 +2964,8 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
rsv_adjust = hugepage_subpool_put_pages(spool, 1);
hugetlb_acct_memory(h, -rsv_adjust);
if (deferred_reserve)
- hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h),
- pages_per_huge_page(h), page);
+ hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
+ pages_per_huge_page(h), folio);
}
return page;

diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 351ffb40261c..7793401acc12 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -349,11 +349,10 @@ void hugetlb_cgroup_commit_charge_rsvd(int idx, unsigned long nr_pages,
/*
* Should be called with hugetlb_lock held
*/
-static void __hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
- struct page *page, bool rsvd)
+static void __hugetlb_cgroup_uncharge_folio(int idx, unsigned long nr_pages,
+ struct folio *folio, bool rsvd)
{
struct hugetlb_cgroup *h_cg;
- struct folio *folio = page_folio(page);

if (hugetlb_cgroup_disabled())
return;
@@ -371,27 +370,27 @@ static void __hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
css_put(&h_cg->css);
else {
unsigned long usage =
- h_cg->nodeinfo[page_to_nid(page)]->usage[idx];
+ h_cg->nodeinfo[folio_nid(folio)]->usage[idx];
/*
* This write is not atomic due to fetching usage and writing
* to it, but that's fine because we call this with
* hugetlb_lock held anyway.
*/
- WRITE_ONCE(h_cg->nodeinfo[page_to_nid(page)]->usage[idx],
+ WRITE_ONCE(h_cg->nodeinfo[folio_nid(folio)]->usage[idx],
usage - nr_pages);
}
}

-void hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
- struct page *page)
+void hugetlb_cgroup_uncharge_folio(int idx, unsigned long nr_pages,
+ struct folio *folio)
{
- __hugetlb_cgroup_uncharge_page(idx, nr_pages, page, false);
+ __hugetlb_cgroup_uncharge_folio(idx, nr_pages, folio, false);
}

-void hugetlb_cgroup_uncharge_page_rsvd(int idx, unsigned long nr_pages,
- struct page *page)
+void hugetlb_cgroup_uncharge_folio_rsvd(int idx, unsigned long nr_pages,
+ struct folio *folio)
{
- __hugetlb_cgroup_uncharge_page(idx, nr_pages, page, true);
+ __hugetlb_cgroup_uncharge_folio(idx, nr_pages, folio, true);
}

static void __hugetlb_cgroup_uncharge_cgroup(int idx, unsigned long nr_pages,
--
2.31.1


2022-11-01 23:38:43

by Sidhartha Kumar

Subject: [PATCH v2 1/9] mm/hugetlb_cgroup: convert __set_hugetlb_cgroup() to folios

Change __set_hugetlb_cgroup() to use folios so it is explicit that the
function operates on a head page.

Signed-off-by: Sidhartha Kumar <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
---
include/linux/hugetlb_cgroup.h | 14 +++++++-------
mm/hugetlb_cgroup.c | 4 ++--
2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
index 630cd255d0cf..7576e9ed8afe 100644
--- a/include/linux/hugetlb_cgroup.h
+++ b/include/linux/hugetlb_cgroup.h
@@ -90,31 +90,31 @@ hugetlb_cgroup_from_page_rsvd(struct page *page)
return __hugetlb_cgroup_from_page(page, true);
}

-static inline void __set_hugetlb_cgroup(struct page *page,
+static inline void __set_hugetlb_cgroup(struct folio *folio,
struct hugetlb_cgroup *h_cg, bool rsvd)
{
- VM_BUG_ON_PAGE(!PageHuge(page), page);
+ VM_BUG_ON_FOLIO(!folio_test_hugetlb(folio), folio);

- if (compound_order(page) < HUGETLB_CGROUP_MIN_ORDER)
+ if (folio_order(folio) < HUGETLB_CGROUP_MIN_ORDER)
return;
if (rsvd)
- set_page_private(page + SUBPAGE_INDEX_CGROUP_RSVD,
+ set_page_private(folio_page(folio, SUBPAGE_INDEX_CGROUP_RSVD),
(unsigned long)h_cg);
else
- set_page_private(page + SUBPAGE_INDEX_CGROUP,
+ set_page_private(folio_page(folio, SUBPAGE_INDEX_CGROUP),
(unsigned long)h_cg);
}

static inline void set_hugetlb_cgroup(struct page *page,
struct hugetlb_cgroup *h_cg)
{
- __set_hugetlb_cgroup(page, h_cg, false);
+ __set_hugetlb_cgroup(page_folio(page), h_cg, false);
}

static inline void set_hugetlb_cgroup_rsvd(struct page *page,
struct hugetlb_cgroup *h_cg)
{
- __set_hugetlb_cgroup(page, h_cg, true);
+ __set_hugetlb_cgroup(page_folio(page), h_cg, true);
}

static inline bool hugetlb_cgroup_disabled(void)
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index c86691c431fd..81675f8f44e9 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -317,7 +317,7 @@ static void __hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages,
if (hugetlb_cgroup_disabled() || !h_cg)
return;

- __set_hugetlb_cgroup(page, h_cg, rsvd);
+ __set_hugetlb_cgroup(page_folio(page), h_cg, rsvd);
if (!rsvd) {
unsigned long usage =
h_cg->nodeinfo[page_to_nid(page)]->usage[idx];
@@ -359,7 +359,7 @@ static void __hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
h_cg = __hugetlb_cgroup_from_page(page, rsvd);
if (unlikely(!h_cg))
return;
- __set_hugetlb_cgroup(page, NULL, rsvd);
+ __set_hugetlb_cgroup(page_folio(page), NULL, rsvd);

page_counter_uncharge(__hugetlb_cgroup_counter_from_cgroup(h_cg, idx,
rsvd),
--
2.31.1


2022-11-01 23:40:37

by Sidhartha Kumar

Subject: [PATCH v2 5/9] mm/hugetlb: convert isolate_or_dissolve_huge_page to folios

Removes a call to compound_head() by using a folio when operating on the
head page of a hugetlb compound page.

Signed-off-by: Sidhartha Kumar <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
---
mm/hugetlb.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2a48feadb41c..bcc39d2613b2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2815,7 +2815,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
{
struct hstate *h;
- struct page *head;
+ struct folio *folio = page_folio(page);
int ret = -EBUSY;

/*
@@ -2824,9 +2824,8 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
* Return success when racing as if we dissolved the page ourselves.
*/
spin_lock_irq(&hugetlb_lock);
- if (PageHuge(page)) {
- head = compound_head(page);
- h = page_hstate(head);
+ if (folio_test_hugetlb(folio)) {
+ h = folio_hstate(folio);
} else {
spin_unlock_irq(&hugetlb_lock);
return 0;
@@ -2841,10 +2840,10 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
if (hstate_is_gigantic(h))
return -ENOMEM;

- if (page_count(head) && !isolate_hugetlb(head, list))
+ if (folio_ref_count(folio) && !isolate_hugetlb(&folio->page, list))
ret = 0;
- else if (!page_count(head))
- ret = alloc_and_dissolve_huge_page(h, head, list);
+ else if (!folio_ref_count(folio))
+ ret = alloc_and_dissolve_huge_page(h, &folio->page, list);

return ret;
}
--
2.31.1


2022-11-01 23:42:11

by Sidhartha Kumar

Subject: [PATCH v2 2/9] mm/hugetlb_cgroup: convert hugetlb_cgroup_from_page() to folios

Introduce folios in __remove_hugetlb_page() by converting
hugetlb_cgroup_from_page() to use folios.

Also gets rid of the unused hugetlb_cgroup_from_page_resv() function.

Signed-off-by: Sidhartha Kumar <[email protected]>
---
include/linux/hugetlb_cgroup.h | 39 +++++++++++++++++-----------------
mm/hugetlb.c | 5 +++--
mm/hugetlb_cgroup.c | 13 +++++++-----
3 files changed, 31 insertions(+), 26 deletions(-)

diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
index 7576e9ed8afe..feb2edafc8b6 100644
--- a/include/linux/hugetlb_cgroup.h
+++ b/include/linux/hugetlb_cgroup.h
@@ -67,27 +67,34 @@ struct hugetlb_cgroup {
};

static inline struct hugetlb_cgroup *
-__hugetlb_cgroup_from_page(struct page *page, bool rsvd)
+__hugetlb_cgroup_from_folio(struct folio *folio, bool rsvd)
{
- VM_BUG_ON_PAGE(!PageHuge(page), page);
+ struct page *tail;

- if (compound_order(page) < HUGETLB_CGROUP_MIN_ORDER)
+ VM_BUG_ON_FOLIO(!folio_test_hugetlb(folio), folio);
+ if (folio_order(folio) < HUGETLB_CGROUP_MIN_ORDER)
return NULL;
- if (rsvd)
- return (void *)page_private(page + SUBPAGE_INDEX_CGROUP_RSVD);
- else
- return (void *)page_private(page + SUBPAGE_INDEX_CGROUP);
+
+ if (rsvd) {
+ tail = folio_page(folio, SUBPAGE_INDEX_CGROUP_RSVD);
+ return (void *)page_private(tail);
+ }
+
+ else {
+ tail = folio_page(folio, SUBPAGE_INDEX_CGROUP);
+ return (void *)page_private(tail);
+ }
}

-static inline struct hugetlb_cgroup *hugetlb_cgroup_from_page(struct page *page)
+static inline struct hugetlb_cgroup *hugetlb_cgroup_from_folio(struct folio *folio)
{
- return __hugetlb_cgroup_from_page(page, false);
+ return __hugetlb_cgroup_from_folio(folio, false);
}

static inline struct hugetlb_cgroup *
-hugetlb_cgroup_from_page_rsvd(struct page *page)
+hugetlb_cgroup_from_folio_rsvd(struct folio *folio)
{
- return __hugetlb_cgroup_from_page(page, true);
+ return __hugetlb_cgroup_from_folio(folio, true);
}

static inline void __set_hugetlb_cgroup(struct folio *folio,
@@ -181,19 +188,13 @@ static inline void hugetlb_cgroup_uncharge_file_region(struct resv_map *resv,
{
}

-static inline struct hugetlb_cgroup *hugetlb_cgroup_from_page(struct page *page)
-{
- return NULL;
-}
-
-static inline struct hugetlb_cgroup *
-hugetlb_cgroup_from_page_resv(struct page *page)
+static inline struct hugetlb_cgroup *hugetlb_cgroup_from_folio(struct folio *folio)
{
return NULL;
}

static inline struct hugetlb_cgroup *
-hugetlb_cgroup_from_page_rsvd(struct page *page)
+hugetlb_cgroup_from_folio_rsvd(struct folio *folio)
{
return NULL;
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 518dfd0a7dbe..27b87dc85c48 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1434,9 +1434,10 @@ static void __remove_hugetlb_page(struct hstate *h, struct page *page,
bool demote)
{
int nid = page_to_nid(page);
+ struct folio *folio = page_folio(page);

- VM_BUG_ON_PAGE(hugetlb_cgroup_from_page(page), page);
- VM_BUG_ON_PAGE(hugetlb_cgroup_from_page_rsvd(page), page);
+ VM_BUG_ON_FOLIO(hugetlb_cgroup_from_folio(folio), folio);
+ VM_BUG_ON_FOLIO(hugetlb_cgroup_from_folio_rsvd(folio), folio);

lockdep_assert_held(&hugetlb_lock);
if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 81675f8f44e9..600c98560a0f 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -191,8 +191,9 @@ static void hugetlb_cgroup_move_parent(int idx, struct hugetlb_cgroup *h_cg,
struct page_counter *counter;
struct hugetlb_cgroup *page_hcg;
struct hugetlb_cgroup *parent = parent_hugetlb_cgroup(h_cg);
+ struct folio *folio = page_folio(page);

- page_hcg = hugetlb_cgroup_from_page(page);
+ page_hcg = hugetlb_cgroup_from_folio(folio);
/*
* We can have pages in active list without any cgroup
* ie, hugepage with less than 3 pages. We can safely
@@ -352,14 +353,15 @@ static void __hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
struct page *page, bool rsvd)
{
struct hugetlb_cgroup *h_cg;
+ struct folio *folio = page_folio(page);

if (hugetlb_cgroup_disabled())
return;
lockdep_assert_held(&hugetlb_lock);
- h_cg = __hugetlb_cgroup_from_page(page, rsvd);
+ h_cg = __hugetlb_cgroup_from_folio(folio, rsvd);
if (unlikely(!h_cg))
return;
- __set_hugetlb_cgroup(page_folio(page), NULL, rsvd);
+ __set_hugetlb_cgroup(folio, NULL, rsvd);

page_counter_uncharge(__hugetlb_cgroup_counter_from_cgroup(h_cg, idx,
rsvd),
@@ -891,13 +893,14 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
struct hugetlb_cgroup *h_cg;
struct hugetlb_cgroup *h_cg_rsvd;
struct hstate *h = page_hstate(oldhpage);
+ struct folio *old_folio = page_folio(oldhpage);

if (hugetlb_cgroup_disabled())
return;

spin_lock_irq(&hugetlb_lock);
- h_cg = hugetlb_cgroup_from_page(oldhpage);
- h_cg_rsvd = hugetlb_cgroup_from_page_rsvd(oldhpage);
+ h_cg = hugetlb_cgroup_from_folio(old_folio);
+ h_cg_rsvd = hugetlb_cgroup_from_folio_rsvd(old_folio);
set_hugetlb_cgroup(oldhpage, NULL);
set_hugetlb_cgroup_rsvd(oldhpage, NULL);

--
2.31.1


2022-11-01 23:58:29

by Sidhartha Kumar

Subject: [PATCH v2 9/9] mm/hugetlb: convert move_hugetlb_state() to folios

Clean up unmap_and_move_huge_page() by converting move_hugetlb_state() to
take in folios.

Signed-off-by: Sidhartha Kumar <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
---
include/linux/hugetlb.h | 6 +++---
mm/hugetlb.c | 22 ++++++++++++----------
mm/migrate.c | 4 ++--
3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index d81f139193aa..375cd57721d6 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -184,7 +184,7 @@ int get_hwpoison_huge_page(struct page *page, bool *hugetlb, bool unpoison);
int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
bool *migratable_cleared);
void putback_active_hugepage(struct page *page);
-void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason);
+void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason);
void free_huge_page(struct page *page);
void hugetlb_fix_reserve_counts(struct inode *inode);
extern struct mutex *hugetlb_fault_mutex_table;
@@ -440,8 +440,8 @@ static inline void putback_active_hugepage(struct page *page)
{
}

-static inline void move_hugetlb_state(struct page *oldpage,
- struct page *newpage, int reason)
+static inline void move_hugetlb_state(struct folio *old_folio,
+ struct folio *new_folio, int reason)
{
}

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2ecc0a6cf883..2ab8f3b7132a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7289,15 +7289,15 @@ void putback_active_hugepage(struct page *page)
put_page(page);
}

-void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason)
+void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason)
{
- struct hstate *h = page_hstate(oldpage);
+ struct hstate *h = folio_hstate(old_folio);

- hugetlb_cgroup_migrate(page_folio(oldpage), page_folio(newpage));
- set_page_owner_migrate_reason(newpage, reason);
+ hugetlb_cgroup_migrate(old_folio, new_folio);
+ set_page_owner_migrate_reason(&new_folio->page, reason);

/*
- * transfer temporary state of the new huge page. This is
+ * transfer temporary state of the new hugetlb folio. This is
* reverse to other transitions because the newpage is going to
* be final while the old one will be freed so it takes over
* the temporary status.
@@ -7306,12 +7306,14 @@ void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason)
* here as well otherwise the global surplus count will not match
* the per-node's.
*/
- if (HPageTemporary(newpage)) {
- int old_nid = page_to_nid(oldpage);
- int new_nid = page_to_nid(newpage);
+ if (folio_test_hugetlb_temporary(new_folio)) {
+ int old_nid = folio_nid(old_folio);
+ int new_nid = folio_nid(new_folio);
+
+
+ folio_set_hugetlb_temporary(old_folio);
+ folio_clear_hugetlb_temporary(new_folio);

- SetHPageTemporary(oldpage);
- ClearHPageTemporary(newpage);

/*
* There is no need to transfer the per-node surplus state
diff --git a/mm/migrate.c b/mm/migrate.c
index d7db4fd97d8e..81f9a36c754d 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1278,7 +1278,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
* folio_mapping() set, hugetlbfs specific move page routine will not
* be called and we could leak usage counts for subpools.
*/
- if (hugetlb_page_subpool(hpage) && !folio_mapping(src)) {
+ if (hugetlb_folio_subpool(src) && !folio_mapping(src)) {
rc = -EBUSY;
goto out_unlock;
}
@@ -1328,7 +1328,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
put_anon_vma(anon_vma);

if (rc == MIGRATEPAGE_SUCCESS) {
- move_hugetlb_state(hpage, new_hpage, reason);
+ move_hugetlb_state(src, dst, reason);
put_new_page = NULL;
}

--
2.31.1


2022-11-02 03:39:23

by Muchun Song

Subject: Re: [PATCH v2 1/9] mm/hugetlb_cgroup: convert __set_hugetlb_cgroup() to folios



> On Nov 2, 2022, at 06:30, Sidhartha Kumar <[email protected]> wrote:
>
> Change __set_hugetlb_cgroup() to use folios so it is explicit that the
> function operates on a head page.
>
> Signed-off-by: Sidhartha Kumar <[email protected]>
> Reviewed-by: Mike Kravetz <[email protected]>

Reviewed-by: Muchun Song <[email protected]>

Thanks.


2022-11-02 06:55:34

by Muchun Song

Subject: Re: [PATCH v2 2/9] mm/hugetlb_cgroup: convert hugetlb_cgroup_from_page() to folios



> On Nov 2, 2022, at 06:30, Sidhartha Kumar <[email protected]> wrote:
>
> Introduce folios in __remove_hugetlb_page() by converting
> hugetlb_cgroup_from_page() to use folios.
>
> Also gets rid of the unused hugetlb_cgroup_from_page_resv() function.
>
> Signed-off-by: Sidhartha Kumar <[email protected]>

Reviewed-by: Muchun Song <[email protected]>

A nit below.

> ---
> include/linux/hugetlb_cgroup.h | 39 +++++++++++++++++-----------------
> mm/hugetlb.c | 5 +++--
> mm/hugetlb_cgroup.c | 13 +++++++-----
> 3 files changed, 31 insertions(+), 26 deletions(-)
>
> diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
> index 7576e9ed8afe..feb2edafc8b6 100644
> --- a/include/linux/hugetlb_cgroup.h
> +++ b/include/linux/hugetlb_cgroup.h
> @@ -67,27 +67,34 @@ struct hugetlb_cgroup {
> };
>
> static inline struct hugetlb_cgroup *
> -__hugetlb_cgroup_from_page(struct page *page, bool rsvd)
> +__hugetlb_cgroup_from_folio(struct folio *folio, bool rsvd)
> {
> - VM_BUG_ON_PAGE(!PageHuge(page), page);
> + struct page *tail;
>
> - if (compound_order(page) < HUGETLB_CGROUP_MIN_ORDER)
> + VM_BUG_ON_FOLIO(!folio_test_hugetlb(folio), folio);
> + if (folio_order(folio) < HUGETLB_CGROUP_MIN_ORDER)
> return NULL;
> - if (rsvd)
> - return (void *)page_private(page + SUBPAGE_INDEX_CGROUP_RSVD);
> - else
> - return (void *)page_private(page + SUBPAGE_INDEX_CGROUP);
> +
> + if (rsvd) {
> + tail = folio_page(folio, SUBPAGE_INDEX_CGROUP_RSVD);
> + return (void *)page_private(tail);
> + }
> +
> + else {
> + tail = folio_page(folio, SUBPAGE_INDEX_CGROUP);
> + return (void *)page_private(tail);
> + }

I suggest reworking the code here as follows (it looks simpler to me).

int i = rsvd ? SUBPAGE_INDEX_CGROUP_RSVD : SUBPAGE_INDEX_CGROUP;

return (void *)page_private(folio_page(folio, i));
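
Folded back into the helper, that would look roughly like this (just a
sketch of the suggestion above, not a patch that has been posted):

	static inline struct hugetlb_cgroup *
	__hugetlb_cgroup_from_folio(struct folio *folio, bool rsvd)
	{
		int idx = rsvd ? SUBPAGE_INDEX_CGROUP_RSVD : SUBPAGE_INDEX_CGROUP;

		VM_BUG_ON_FOLIO(!folio_test_hugetlb(folio), folio);
		if (folio_order(folio) < HUGETLB_CGROUP_MIN_ORDER)
			return NULL;

		/* the cgroup pointers are stored in page_private of tail pages */
		return (void *)page_private(folio_page(folio, idx));
	}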

> }
>
> -static inline struct hugetlb_cgroup *hugetlb_cgroup_from_page(struct page *page)
> +static inline struct hugetlb_cgroup *hugetlb_cgroup_from_folio(struct folio *folio)
> {
> - return __hugetlb_cgroup_from_page(page, false);
> + return __hugetlb_cgroup_from_folio(folio, false);
> }
>
> static inline struct hugetlb_cgroup *
> -hugetlb_cgroup_from_page_rsvd(struct page *page)
> +hugetlb_cgroup_from_folio_rsvd(struct folio *folio)
> {
> - return __hugetlb_cgroup_from_page(page, true);
> + return __hugetlb_cgroup_from_folio(folio, true);
> }
>
> static inline void __set_hugetlb_cgroup(struct folio *folio,
> @@ -181,19 +188,13 @@ static inline void hugetlb_cgroup_uncharge_file_region(struct resv_map *resv,
> {
> }
>
> -static inline struct hugetlb_cgroup *hugetlb_cgroup_from_page(struct page *page)
> -{
> - return NULL;
> -}
> -
> -static inline struct hugetlb_cgroup *
> -hugetlb_cgroup_from_page_resv(struct page *page)
> +static inline struct hugetlb_cgroup *hugetlb_cgroup_from_folio(struct folio *folio)
> {
> return NULL;
> }
>
> static inline struct hugetlb_cgroup *
> -hugetlb_cgroup_from_page_rsvd(struct page *page)
> +hugetlb_cgroup_from_folio_rsvd(struct folio *folio)
> {
> return NULL;
> }
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 518dfd0a7dbe..27b87dc85c48 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1434,9 +1434,10 @@ static void __remove_hugetlb_page(struct hstate *h, struct page *page,
> bool demote)
> {
> int nid = page_to_nid(page);
> + struct folio *folio = page_folio(page);
>
> - VM_BUG_ON_PAGE(hugetlb_cgroup_from_page(page), page);
> - VM_BUG_ON_PAGE(hugetlb_cgroup_from_page_rsvd(page), page);
> + VM_BUG_ON_FOLIO(hugetlb_cgroup_from_folio(folio), folio);
> + VM_BUG_ON_FOLIO(hugetlb_cgroup_from_folio_rsvd(folio), folio);
>
> lockdep_assert_held(&hugetlb_lock);
> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> index 81675f8f44e9..600c98560a0f 100644
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -191,8 +191,9 @@ static void hugetlb_cgroup_move_parent(int idx, struct hugetlb_cgroup *h_cg,
> struct page_counter *counter;
> struct hugetlb_cgroup *page_hcg;
> struct hugetlb_cgroup *parent = parent_hugetlb_cgroup(h_cg);
> + struct folio *folio = page_folio(page);
>
> - page_hcg = hugetlb_cgroup_from_page(page);
> + page_hcg = hugetlb_cgroup_from_folio(folio);
> /*
> * We can have pages in active list without any cgroup
> * ie, hugepage with less than 3 pages. We can safely
> @@ -352,14 +353,15 @@ static void __hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
> struct page *page, bool rsvd)
> {
> struct hugetlb_cgroup *h_cg;
> + struct folio *folio = page_folio(page);
>
> if (hugetlb_cgroup_disabled())
> return;
> lockdep_assert_held(&hugetlb_lock);
> - h_cg = __hugetlb_cgroup_from_page(page, rsvd);
> + h_cg = __hugetlb_cgroup_from_folio(folio, rsvd);
> if (unlikely(!h_cg))
> return;
> - __set_hugetlb_cgroup(page_folio(page), NULL, rsvd);
> + __set_hugetlb_cgroup(folio, NULL, rsvd);
>
> page_counter_uncharge(__hugetlb_cgroup_counter_from_cgroup(h_cg, idx,
> rsvd),
> @@ -891,13 +893,14 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
> struct hugetlb_cgroup *h_cg;
> struct hugetlb_cgroup *h_cg_rsvd;
> struct hstate *h = page_hstate(oldhpage);
> + struct folio *old_folio = page_folio(oldhpage);
>
> if (hugetlb_cgroup_disabled())
> return;
>
> spin_lock_irq(&hugetlb_lock);
> - h_cg = hugetlb_cgroup_from_page(oldhpage);
> - h_cg_rsvd = hugetlb_cgroup_from_page_rsvd(oldhpage);
> + h_cg = hugetlb_cgroup_from_folio(old_folio);
> + h_cg_rsvd = hugetlb_cgroup_from_folio_rsvd(old_folio);
> set_hugetlb_cgroup(oldhpage, NULL);
> set_hugetlb_cgroup_rsvd(oldhpage, NULL);
>
> --
> 2.31.1
>
>


2022-11-02 07:16:34

by Muchun Song

Subject: Re: [PATCH v2 3/9] mm/hugetlb_cgroup: convert set_hugetlb_cgroup*() to folios



> On Nov 2, 2022, at 06:30, Sidhartha Kumar <[email protected]> wrote:
>
> Allows __prep_new_huge_page() to operate on a folio by converting
> set_hugetlb_cgroup*() to take in a folio.
>
> Signed-off-by: Sidhartha Kumar <[email protected]>
> ---
> include/linux/hugetlb_cgroup.h | 12 ++++++------
> mm/hugetlb.c | 33 +++++++++++++++++++--------------
> mm/hugetlb_cgroup.c | 11 ++++++-----
> 3 files changed, 31 insertions(+), 25 deletions(-)
>
> diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
> index feb2edafc8b6..a7e3540f7f38 100644
> --- a/include/linux/hugetlb_cgroup.h
> +++ b/include/linux/hugetlb_cgroup.h
> @@ -112,16 +112,16 @@ static inline void __set_hugetlb_cgroup(struct folio *folio,
> (unsigned long)h_cg);
> }
>
> -static inline void set_hugetlb_cgroup(struct page *page,
> +static inline void set_hugetlb_cgroup(struct folio *folio,
> struct hugetlb_cgroup *h_cg)
> {
> - __set_hugetlb_cgroup(page_folio(page), h_cg, false);
> + __set_hugetlb_cgroup(folio, h_cg, false);
> }
>
> -static inline void set_hugetlb_cgroup_rsvd(struct page *page,
> +static inline void set_hugetlb_cgroup_rsvd(struct folio *folio,
> struct hugetlb_cgroup *h_cg)
> {
> - __set_hugetlb_cgroup(page_folio(page), h_cg, true);
> + __set_hugetlb_cgroup(folio, h_cg, true);
> }
>
> static inline bool hugetlb_cgroup_disabled(void)
> @@ -199,12 +199,12 @@ hugetlb_cgroup_from_folio_rsvd(struct folio *folio)
> return NULL;
> }
>
> -static inline void set_hugetlb_cgroup(struct page *page,
> +static inline void set_hugetlb_cgroup(struct folio *folio,
> struct hugetlb_cgroup *h_cg)
> {
> }
>
> -static inline void set_hugetlb_cgroup_rsvd(struct page *page,
> +static inline void set_hugetlb_cgroup_rsvd(struct folio *folio,
> struct hugetlb_cgroup *h_cg)
> {
> }
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 27b87dc85c48..a6384fb0b57f 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1758,19 +1758,21 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)
> h->nr_huge_pages_node[nid]++;
> }
>
> -static void __prep_new_huge_page(struct hstate *h, struct page *page)
> +static void __prep_new_hugetlb_folio(struct hstate *h, struct folio *folio)
> {
> - hugetlb_vmemmap_optimize(h, page);
> - INIT_LIST_HEAD(&page->lru);
> - set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
> - hugetlb_set_page_subpool(page, NULL);
> - set_hugetlb_cgroup(page, NULL);
> - set_hugetlb_cgroup_rsvd(page, NULL);
> + hugetlb_vmemmap_optimize(h, &folio->page);
> + INIT_LIST_HEAD(&folio->lru);
> + folio->_folio_dtor = HUGETLB_PAGE_DTOR;

Seems like a variant of set_compound_page_dtor() for folio is missing,
e.g. set_large_folio_dtor(). It's better to add it in this series.
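
Something like the below is what I mean. Only a sketch; the exact form is
an assumption, not something that already exists in the tree:

	/* hypothetical folio counterpart of set_compound_page_dtor() */
	static inline void set_large_folio_dtor(struct folio *folio,
						enum compound_dtor_id dtor)
	{
		VM_BUG_ON_FOLIO(dtor >= NR_COMPOUND_DTORS, folio);
		folio->_folio_dtor = dtor;
	}

__prep_new_hugetlb_folio() could then call
set_large_folio_dtor(folio, HUGETLB_PAGE_DTOR) instead of writing
folio->_folio_dtor directly.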

Thanks.

> + hugetlb_set_folio_subpool(folio, NULL);
> + set_hugetlb_cgroup(folio, NULL);
> + set_hugetlb_cgroup_rsvd(folio, NULL);
> }
>
> static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
> {
> - __prep_new_huge_page(h, page);
> + struct folio *folio = page_folio(page);
> +
> + __prep_new_hugetlb_folio(h, folio);
> spin_lock_irq(&hugetlb_lock);
> __prep_account_new_huge_page(h, nid);
> spin_unlock_irq(&hugetlb_lock);
> @@ -2731,8 +2733,10 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
> struct list_head *list)
> {
> gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
> - int nid = page_to_nid(old_page);
> + struct folio *old_folio = page_folio(old_page);
> + int nid = folio_nid(old_folio);
> struct page *new_page;
> + struct folio *new_folio;
> int ret = 0;
>
> /*
> @@ -2745,16 +2749,17 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
> new_page = alloc_buddy_huge_page(h, gfp_mask, nid, NULL, NULL);
> if (!new_page)
> return -ENOMEM;
> - __prep_new_huge_page(h, new_page);
> + new_folio = page_folio(new_page);
> + __prep_new_hugetlb_folio(h, new_folio);
>
> retry:
> spin_lock_irq(&hugetlb_lock);
> - if (!PageHuge(old_page)) {
> + if (!folio_test_hugetlb(old_folio)) {
> /*
> * Freed from under us. Drop new_page too.
> */
> goto free_new;
> - } else if (page_count(old_page)) {
> + } else if (folio_ref_count(old_folio)) {
> /*
> * Someone has grabbed the page, try to isolate it here.
> * Fail with -EBUSY if not possible.
> @@ -2763,7 +2768,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
> ret = isolate_hugetlb(old_page, list);
> spin_lock_irq(&hugetlb_lock);
> goto free_new;
> - } else if (!HPageFreed(old_page)) {
> + } else if (!folio_test_hugetlb_freed(old_folio)) {
> /*
> * Page's refcount is 0 but it has not been enqueued in the
> * freelist yet. Race window is small, so we can succeed here if
> @@ -2801,7 +2806,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
> free_new:
> spin_unlock_irq(&hugetlb_lock);
> /* Page has a zero ref count, but needs a ref to be freed */
> - set_page_refcounted(new_page);
> + folio_ref_unfreeze(new_folio, 1);
> update_and_free_page(h, new_page, false);
>
> return ret;
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> index 600c98560a0f..692b23b5d423 100644
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -212,7 +212,7 @@ static void hugetlb_cgroup_move_parent(int idx, struct hugetlb_cgroup *h_cg,
> /* Take the pages off the local counter */
> page_counter_cancel(counter, nr_pages);
>
> - set_hugetlb_cgroup(page, parent);
> + set_hugetlb_cgroup(folio, parent);
> out:
> return;
> }
> @@ -894,6 +894,7 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
> struct hugetlb_cgroup *h_cg_rsvd;
> struct hstate *h = page_hstate(oldhpage);
> struct folio *old_folio = page_folio(oldhpage);
> + struct folio *new_folio = page_folio(newhpage);
>
> if (hugetlb_cgroup_disabled())
> return;
> @@ -901,12 +902,12 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
> spin_lock_irq(&hugetlb_lock);
> h_cg = hugetlb_cgroup_from_folio(old_folio);
> h_cg_rsvd = hugetlb_cgroup_from_folio_rsvd(old_folio);
> - set_hugetlb_cgroup(oldhpage, NULL);
> - set_hugetlb_cgroup_rsvd(oldhpage, NULL);
> + set_hugetlb_cgroup(old_folio, NULL);
> + set_hugetlb_cgroup_rsvd(old_folio, NULL);
>
> /* move the h_cg details to new cgroup */
> - set_hugetlb_cgroup(newhpage, h_cg);
> - set_hugetlb_cgroup_rsvd(newhpage, h_cg_rsvd);
> + set_hugetlb_cgroup(new_folio, h_cg);
> + set_hugetlb_cgroup_rsvd(new_folio, h_cg_rsvd);
> list_move(&newhpage->lru, &h->hugepage_activelist);
> spin_unlock_irq(&hugetlb_lock);
> return;
> --
> 2.31.1
>
>


2022-11-02 07:32:51

by Muchun Song

Subject: Re: [PATCH v2 5/9] mm/hugetlb: convert isolate_or_dissolve_huge_page to folios



> On Nov 2, 2022, at 06:30, Sidhartha Kumar <[email protected]> wrote:
>
> Removes a call to compound_head() by using a folio when operating on the
> head page of a hugetlb compound page.
>
> Signed-off-by: Sidhartha Kumar <[email protected]>
> Reviewed-by: Mike Kravetz <[email protected]>

Reviewed-by: Muchun Song <[email protected]>

Thanks.


2022-11-02 07:47:27

by Muchun Song

Subject: Re: [PATCH v2 7/9] mm/hugetlb_cgroup: convert hugetlb_cgroup_uncharge_page() to folios



> On Nov 2, 2022, at 06:30, Sidhartha Kumar <[email protected]> wrote:
>
> Continue to use a folio inside free_huge_page() by converting
> hugetlb_cgroup_uncharge_page*() to folios.
>
> Signed-off-by: Sidhartha Kumar <[email protected]>
> Reviewed-by: Mike Kravetz <[email protected]>


Reviewed-by: Muchun Song <[email protected]>

A nit below.

> ---
> include/linux/hugetlb_cgroup.h | 16 ++++++++--------
> mm/hugetlb.c | 15 +++++++++------
> mm/hugetlb_cgroup.c | 21 ++++++++++-----------
> 3 files changed, 27 insertions(+), 25 deletions(-)
>
> diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
> index 789b6fef176d..c70f92fe493e 100644
> --- a/include/linux/hugetlb_cgroup.h
> +++ b/include/linux/hugetlb_cgroup.h
> @@ -158,10 +158,10 @@ extern void hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages,
> extern void hugetlb_cgroup_commit_charge_rsvd(int idx, unsigned long nr_pages,
> struct hugetlb_cgroup *h_cg,
> struct page *page);
> -extern void hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
> - struct page *page);
> -extern void hugetlb_cgroup_uncharge_page_rsvd(int idx, unsigned long nr_pages,
> - struct page *page);
> +extern void hugetlb_cgroup_uncharge_folio(int idx, unsigned long nr_pages,
> + struct folio *folio);
> +extern void hugetlb_cgroup_uncharge_folio_rsvd(int idx, unsigned long nr_pages,
> + struct folio *folio);
>
> extern void hugetlb_cgroup_uncharge_cgroup(int idx, unsigned long nr_pages,
> struct hugetlb_cgroup *h_cg);
> @@ -254,14 +254,14 @@ hugetlb_cgroup_commit_charge_rsvd(int idx, unsigned long nr_pages,
> {
> }
>
> -static inline void hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
> - struct page *page)
> +static inline void hugetlb_cgroup_uncharge_folio(int idx, unsigned long nr_pages,
> + struct folio *folio)
> {
> }
>
> -static inline void hugetlb_cgroup_uncharge_page_rsvd(int idx,
> +static inline void hugetlb_cgroup_uncharge_folio_rsvd(int idx,
> unsigned long nr_pages,
> - struct page *page)
> + struct folio *folio)
> {
> }
> static inline void hugetlb_cgroup_uncharge_cgroup(int idx,
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 387b8d74107d..2ecc0a6cf883 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1726,10 +1726,10 @@ void free_huge_page(struct page *page)
>
> spin_lock_irqsave(&hugetlb_lock, flags);
> folio_clear_hugetlb_migratable(folio);
> - hugetlb_cgroup_uncharge_page(hstate_index(h),
> - pages_per_huge_page(h), page);
> - hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h),
> - pages_per_huge_page(h), page);
> + hugetlb_cgroup_uncharge_folio(hstate_index(h),
> + pages_per_huge_page(h), folio);
> + hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> + pages_per_huge_page(h), folio);
> if (restore_reserve)
> h->resv_huge_pages++;
>
> @@ -2855,6 +2855,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
> struct hugepage_subpool *spool = subpool_vma(vma);
> struct hstate *h = hstate_vma(vma);
> struct page *page;
> + struct folio *folio;
> long map_chg, map_commit;
> long gbl_chg;
> int ret, idx;
> @@ -2918,6 +2919,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
> * a reservation exists for the allocation.
> */
> page = dequeue_huge_page_vma(h, vma, addr, avoid_reserve, gbl_chg);
> +

Redundant blank line.

> if (!page) {
> spin_unlock_irq(&hugetlb_lock);
> page = alloc_buddy_huge_page_with_mpol(h, vma, addr);
> @@ -2932,6 +2934,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
> set_page_refcounted(page);
> /* Fall through */
> }
> + folio = page_folio(page);
> hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, page);
> /* If allocation is not consuming a reservation, also store the
> * hugetlb_cgroup pointer on the page.
> @@ -2961,8 +2964,8 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
> rsv_adjust = hugepage_subpool_put_pages(spool, 1);
> hugetlb_acct_memory(h, -rsv_adjust);
> if (deferred_reserve)
> - hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h),
> - pages_per_huge_page(h), page);
> + hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> + pages_per_huge_page(h), folio);
> }
> return page;
>
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> index 351ffb40261c..7793401acc12 100644
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -349,11 +349,10 @@ void hugetlb_cgroup_commit_charge_rsvd(int idx, unsigned long nr_pages,
> /*
> * Should be called with hugetlb_lock held
> */
> -static void __hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
> - struct page *page, bool rsvd)
> +static void __hugetlb_cgroup_uncharge_folio(int idx, unsigned long nr_pages,
> + struct folio *folio, bool rsvd)
> {
> struct hugetlb_cgroup *h_cg;
> - struct folio *folio = page_folio(page);
>
> if (hugetlb_cgroup_disabled())
> return;
> @@ -371,27 +370,27 @@ static void __hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
> css_put(&h_cg->css);
> else {
> unsigned long usage =
> - h_cg->nodeinfo[page_to_nid(page)]->usage[idx];
> + h_cg->nodeinfo[folio_nid(folio)]->usage[idx];
> /*
> * This write is not atomic due to fetching usage and writing
> * to it, but that's fine because we call this with
> * hugetlb_lock held anyway.
> */
> - WRITE_ONCE(h_cg->nodeinfo[page_to_nid(page)]->usage[idx],
> + WRITE_ONCE(h_cg->nodeinfo[folio_nid(folio)]->usage[idx],
> usage - nr_pages);
> }
> }
>
> -void hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
> - struct page *page)
> +void hugetlb_cgroup_uncharge_folio(int idx, unsigned long nr_pages,
> + struct folio *folio)
> {
> - __hugetlb_cgroup_uncharge_page(idx, nr_pages, page, false);
> + __hugetlb_cgroup_uncharge_folio(idx, nr_pages, folio, false);
> }
>
> -void hugetlb_cgroup_uncharge_page_rsvd(int idx, unsigned long nr_pages,
> - struct page *page)
> +void hugetlb_cgroup_uncharge_folio_rsvd(int idx, unsigned long nr_pages,
> + struct folio *folio)
> {
> - __hugetlb_cgroup_uncharge_page(idx, nr_pages, page, true);
> + __hugetlb_cgroup_uncharge_folio(idx, nr_pages, folio, true);
> }
>
> static void __hugetlb_cgroup_uncharge_cgroup(int idx, unsigned long nr_pages,
> --
> 2.31.1
>
>


2022-11-02 07:48:32

by Muchun Song

Subject: Re: [PATCH v2 6/9] mm/hugetlb: convert free_huge_page to folios



> On Nov 2, 2022, at 06:30, Sidhartha Kumar <[email protected]> wrote:
>
> Use folios inside free_huge_page(); this is in preparation for converting
> hugetlb_cgroup_uncharge_page() to take in a folio.
>
> Signed-off-by: Sidhartha Kumar <[email protected]>
> Reviewed-by: Mike Kravetz <[email protected]>

Reviewed-by: Muchun Song <[email protected]>

Thanks.


2022-11-02 07:49:15

by Muchun Song

Subject: Re: [PATCH v2 4/9] mm/hugetlb_cgroup: convert hugetlb_cgroup_migrate to folios



> On Nov 2, 2022, at 06:30, Sidhartha Kumar <[email protected]> wrote:
>
> Cleans up intermediate page to folio conversion code in
> hugetlb_cgroup_migrate() by changing its arguments from pages to folios.
>
> Signed-off-by: Sidhartha Kumar <[email protected]>
> Reviewed-by: Mike Kravetz <[email protected]>

Reviewed-by: Muchun Song <[email protected]>

Thanks.


2022-11-02 07:50:12

by Muchun Song

Subject: Re: [PATCH v2 9/9] mm/hugetlb: convert move_hugetlb_state() to folios



> On Nov 2, 2022, at 06:30, Sidhartha Kumar <[email protected]> wrote:
>
> Clean up unmap_and_move_huge_page() by converting move_hugetlb_state() to
> take in folios.
>
> Signed-off-by: Sidhartha Kumar <[email protected]>
> Reviewed-by: Mike Kravetz <[email protected]>

Reviewed-by: Muchun Song <[email protected]>

A nit below.

> ---
> include/linux/hugetlb.h | 6 +++---
> mm/hugetlb.c | 22 ++++++++++++----------
> mm/migrate.c | 4 ++--
> 3 files changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index d81f139193aa..375cd57721d6 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -184,7 +184,7 @@ int get_hwpoison_huge_page(struct page *page, bool *hugetlb, bool unpoison);
> int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
> bool *migratable_cleared);
> void putback_active_hugepage(struct page *page);
> -void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason);
> +void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason);
> void free_huge_page(struct page *page);
> void hugetlb_fix_reserve_counts(struct inode *inode);
> extern struct mutex *hugetlb_fault_mutex_table;
> @@ -440,8 +440,8 @@ static inline void putback_active_hugepage(struct page *page)
> {
> }
>
> -static inline void move_hugetlb_state(struct page *oldpage,
> - struct page *newpage, int reason)
> +static inline void move_hugetlb_state(struct folio *old_folio,
> + struct folio *new_folio, int reason)
> {
> }
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 2ecc0a6cf883..2ab8f3b7132a 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -7289,15 +7289,15 @@ void putback_active_hugepage(struct page *page)
> put_page(page);
> }
>
> -void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason)
> +void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason)
> {
> - struct hstate *h = page_hstate(oldpage);
> + struct hstate *h = folio_hstate(old_folio);
>
> - hugetlb_cgroup_migrate(page_folio(oldpage), page_folio(newpage));
> - set_page_owner_migrate_reason(newpage, reason);
> + hugetlb_cgroup_migrate(old_folio, new_folio);
> + set_page_owner_migrate_reason(&new_folio->page, reason);
>
> /*
> - * transfer temporary state of the new huge page. This is
> + * transfer temporary state of the new hugetlb folio. This is
> * reverse to other transitions because the newpage is going to
> * be final while the old one will be freed so it takes over
> * the temporary status.
> @@ -7306,12 +7306,14 @@ void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason)
> * here as well otherwise the global surplus count will not match
> * the per-node's.
> */
> - if (HPageTemporary(newpage)) {
> - int old_nid = page_to_nid(oldpage);
> - int new_nid = page_to_nid(newpage);
> + if (folio_test_hugetlb_temporary(new_folio)) {
> + int old_nid = folio_nid(old_folio);
> + int new_nid = folio_nid(new_folio);
> +
> +

Please remove this redundant blank line.

Thanks.

> + folio_set_hugetlb_temporary(old_folio);
> + folio_clear_hugetlb_temporary(new_folio);
>
> - SetHPageTemporary(oldpage);
> - ClearHPageTemporary(newpage);
>
> /*
> * There is no need to transfer the per-node surplus state
> diff --git a/mm/migrate.c b/mm/migrate.c
> index d7db4fd97d8e..81f9a36c754d 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1278,7 +1278,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> * folio_mapping() set, hugetlbfs specific move page routine will not
> * be called and we could leak usage counts for subpools.
> */
> - if (hugetlb_page_subpool(hpage) && !folio_mapping(src)) {
> + if (hugetlb_folio_subpool(src) && !folio_mapping(src)) {
> rc = -EBUSY;
> goto out_unlock;
> }
> @@ -1328,7 +1328,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> put_anon_vma(anon_vma);
>
> if (rc == MIGRATEPAGE_SUCCESS) {
> - move_hugetlb_state(hpage, new_hpage, reason);
> + move_hugetlb_state(src, dst, reason);
> put_new_page = NULL;
> }
>
> --
> 2.31.1
>
>


2022-11-10 00:29:26

by Sidhartha Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] mm/hugetlb_cgroup: convert set_hugetlb_cgroup*() to folios


On 11/1/22 11:45 PM, Muchun Song wrote:
>
>> On Nov 2, 2022, at 06:30, Sidhartha Kumar <[email protected]> wrote:
>>
>> Allows __prep_new_huge_page() to operate on a folio by converting
>> set_hugetlb_cgroup*() to take in a folio.
>>
>> Signed-off-by: Sidhartha Kumar <[email protected]>
>> ---
>> include/linux/hugetlb_cgroup.h | 12 ++++++------
>> mm/hugetlb.c | 33 +++++++++++++++++++--------------
>> mm/hugetlb_cgroup.c | 11 ++++++-----
>> 3 files changed, 31 insertions(+), 25 deletions(-)
>>
>> diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
>> index feb2edafc8b6..a7e3540f7f38 100644
>> --- a/include/linux/hugetlb_cgroup.h
>> +++ b/include/linux/hugetlb_cgroup.h
>> @@ -112,16 +112,16 @@ static inline void __set_hugetlb_cgroup(struct folio *folio,
>> (unsigned long)h_cg);
>> }
>>
>> -static inline void set_hugetlb_cgroup(struct page *page,
>> +static inline void set_hugetlb_cgroup(struct folio *folio,
>> struct hugetlb_cgroup *h_cg)
>> {
>> - __set_hugetlb_cgroup(page_folio(page), h_cg, false);
>> + __set_hugetlb_cgroup(folio, h_cg, false);
>> }
>>
>> -static inline void set_hugetlb_cgroup_rsvd(struct page *page,
>> +static inline void set_hugetlb_cgroup_rsvd(struct folio *folio,
>> struct hugetlb_cgroup *h_cg)
>> {
>> - __set_hugetlb_cgroup(page_folio(page), h_cg, true);
>> + __set_hugetlb_cgroup(folio, h_cg, true);
>> }
>>
>> static inline bool hugetlb_cgroup_disabled(void)
>> @@ -199,12 +199,12 @@ hugetlb_cgroup_from_folio_rsvd(struct folio *folio)
>> return NULL;
>> }
>>
>> -static inline void set_hugetlb_cgroup(struct page *page,
>> +static inline void set_hugetlb_cgroup(struct folio *folio,
>> struct hugetlb_cgroup *h_cg)
>> {
>> }
>>
>> -static inline void set_hugetlb_cgroup_rsvd(struct page *page,
>> +static inline void set_hugetlb_cgroup_rsvd(struct folio *folio,
>> struct hugetlb_cgroup *h_cg)
>> {
>> }
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 27b87dc85c48..a6384fb0b57f 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1758,19 +1758,21 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)
>> h->nr_huge_pages_node[nid]++;
>> }
>>
>> -static void __prep_new_huge_page(struct hstate *h, struct page *page)
>> +static void __prep_new_hugetlb_folio(struct hstate *h, struct folio *folio)
>> {
>> - hugetlb_vmemmap_optimize(h, page);
>> - INIT_LIST_HEAD(&page->lru);
>> - set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
>> - hugetlb_set_page_subpool(page, NULL);
>> - set_hugetlb_cgroup(page, NULL);
>> - set_hugetlb_cgroup_rsvd(page, NULL);
>> + hugetlb_vmemmap_optimize(h, &folio->page);
>> + INIT_LIST_HEAD(&folio->lru);
>> + folio->_folio_dtor = HUGETLB_PAGE_DTOR;
> Seems like a variant of set_compound_page_dtor() for folio is missing,
> e.g. set_large_folio_dtor(). It's better to add it in this series.
>
> Thanks.
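
A sketch, for readers following along: the helper Muchun suggests above did
not exist at this point, so the name set_large_folio_dtor() and the sanity
check below are assumptions taken from his example and from the existing
set_compound_page_dtor(); the body simply wraps what the patch open-codes
via folio->_folio_dtor.

/* Hypothetical folio analogue of set_compound_page_dtor(); illustrative only. */
static inline void set_large_folio_dtor(struct folio *folio,
                                        enum compound_dtor_id dtor)
{
        VM_BUG_ON_FOLIO(dtor >= NR_COMPOUND_DTORS, folio);
        folio->_folio_dtor = dtor;
}

With such a helper, the assignment in __prep_new_hugetlb_folio() above would
read set_large_folio_dtor(folio, HUGETLB_PAGE_DTOR).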

Hi Muchun, thanks for taking a look.

Would it be ok to add this functionality in a separate patch series?
Some of the earlier patches in this series were modified by Hugh's
series[1], so I'm not sure how a v3 of this series would look now. Let me
know which approach you would prefer.


Thanks,

Sidhartha Kumar

[1]
https://lore.kernel.org/linux-mm/[email protected]/

>> + hugetlb_set_folio_subpool(folio, NULL);
>> + set_hugetlb_cgroup(folio, NULL);
>> + set_hugetlb_cgroup_rsvd(folio, NULL);
>> }
>>
>> static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
>> {
>> - __prep_new_huge_page(h, page);
>> + struct folio *folio = page_folio(page);
>> +
>> + __prep_new_hugetlb_folio(h, folio);
>> spin_lock_irq(&hugetlb_lock);
>> __prep_account_new_huge_page(h, nid);
>> spin_unlock_irq(&hugetlb_lock);
>> @@ -2731,8 +2733,10 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>> struct list_head *list)
>> {
>> gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
>> - int nid = page_to_nid(old_page);
>> + struct folio *old_folio = page_folio(old_page);
>> + int nid = folio_nid(old_folio);
>> struct page *new_page;
>> + struct folio *new_folio;
>> int ret = 0;
>>
>> /*
>> @@ -2745,16 +2749,17 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>> new_page = alloc_buddy_huge_page(h, gfp_mask, nid, NULL, NULL);
>> if (!new_page)
>> return -ENOMEM;
>> - __prep_new_huge_page(h, new_page);
>> + new_folio = page_folio(new_page);
>> + __prep_new_hugetlb_folio(h, new_folio);
>>
>> retry:
>> spin_lock_irq(&hugetlb_lock);
>> - if (!PageHuge(old_page)) {
>> + if (!folio_test_hugetlb(old_folio)) {
>> /*
>> * Freed from under us. Drop new_page too.
>> */
>> goto free_new;
>> - } else if (page_count(old_page)) {
>> + } else if (folio_ref_count(old_folio)) {
>> /*
>> * Someone has grabbed the page, try to isolate it here.
>> * Fail with -EBUSY if not possible.
>> @@ -2763,7 +2768,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>> ret = isolate_hugetlb(old_page, list);
>> spin_lock_irq(&hugetlb_lock);
>> goto free_new;
>> - } else if (!HPageFreed(old_page)) {
>> + } else if (!folio_test_hugetlb_freed(old_folio)) {
>> /*
>> * Page's refcount is 0 but it has not been enqueued in the
>> * freelist yet. Race window is small, so we can succeed here if
>> @@ -2801,7 +2806,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>> free_new:
>> spin_unlock_irq(&hugetlb_lock);
>> /* Page has a zero ref count, but needs a ref to be freed */
>> - set_page_refcounted(new_page);
>> + folio_ref_unfreeze(new_folio, 1);
>> update_and_free_page(h, new_page, false);
>>
>> return ret;
>> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
>> index 600c98560a0f..692b23b5d423 100644
>> --- a/mm/hugetlb_cgroup.c
>> +++ b/mm/hugetlb_cgroup.c
>> @@ -212,7 +212,7 @@ static void hugetlb_cgroup_move_parent(int idx, struct hugetlb_cgroup *h_cg,
>> /* Take the pages off the local counter */
>> page_counter_cancel(counter, nr_pages);
>>
>> - set_hugetlb_cgroup(page, parent);
>> + set_hugetlb_cgroup(folio, parent);
>> out:
>> return;
>> }
>> @@ -894,6 +894,7 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
>> struct hugetlb_cgroup *h_cg_rsvd;
>> struct hstate *h = page_hstate(oldhpage);
>> struct folio *old_folio = page_folio(oldhpage);
>> + struct folio *new_folio = page_folio(newhpage);
>>
>> if (hugetlb_cgroup_disabled())
>> return;
>> @@ -901,12 +902,12 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
>> spin_lock_irq(&hugetlb_lock);
>> h_cg = hugetlb_cgroup_from_folio(old_folio);
>> h_cg_rsvd = hugetlb_cgroup_from_folio_rsvd(old_folio);
>> - set_hugetlb_cgroup(oldhpage, NULL);
>> - set_hugetlb_cgroup_rsvd(oldhpage, NULL);
>> + set_hugetlb_cgroup(old_folio, NULL);
>> + set_hugetlb_cgroup_rsvd(old_folio, NULL);
>>
>> /* move the h_cg details to new cgroup */
>> - set_hugetlb_cgroup(newhpage, h_cg);
>> - set_hugetlb_cgroup_rsvd(newhpage, h_cg_rsvd);
>> + set_hugetlb_cgroup(new_folio, h_cg);
>> + set_hugetlb_cgroup_rsvd(new_folio, h_cg_rsvd);
>> list_move(&newhpage->lru, &h->hugepage_activelist);
>> spin_unlock_irq(&hugetlb_lock);
>> return;
>> --
>> 2.31.1
>>
>>
>

2022-11-10 07:59:09

by Muchun Song

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] mm/hugetlb_cgroup: convert set_hugetlb_cgroup*() to folios



> On Nov 10, 2022, at 08:20, Sidhartha Kumar <[email protected]> wrote:
>
>
> On 11/1/22 11:45 PM, Muchun Song wrote:
>>
>>> On Nov 2, 2022, at 06:30, Sidhartha Kumar <[email protected]> wrote:
>>>
>>> Allows __prep_new_huge_page() to operate on a folio by converting
>>> set_hugetlb_cgroup*() to take in a folio.
>>>
>>> Signed-off-by: Sidhartha Kumar <[email protected]>
>>> ---
>>> include/linux/hugetlb_cgroup.h | 12 ++++++------
>>> mm/hugetlb.c | 33 +++++++++++++++++++--------------
>>> mm/hugetlb_cgroup.c | 11 ++++++-----
>>> 3 files changed, 31 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
>>> index feb2edafc8b6..a7e3540f7f38 100644
>>> --- a/include/linux/hugetlb_cgroup.h
>>> +++ b/include/linux/hugetlb_cgroup.h
>>> @@ -112,16 +112,16 @@ static inline void __set_hugetlb_cgroup(struct folio *folio,
>>> (unsigned long)h_cg);
>>> }
>>>
>>> -static inline void set_hugetlb_cgroup(struct page *page,
>>> +static inline void set_hugetlb_cgroup(struct folio *folio,
>>> struct hugetlb_cgroup *h_cg)
>>> {
>>> - __set_hugetlb_cgroup(page_folio(page), h_cg, false);
>>> + __set_hugetlb_cgroup(folio, h_cg, false);
>>> }
>>>
>>> -static inline void set_hugetlb_cgroup_rsvd(struct page *page,
>>> +static inline void set_hugetlb_cgroup_rsvd(struct folio *folio,
>>> struct hugetlb_cgroup *h_cg)
>>> {
>>> - __set_hugetlb_cgroup(page_folio(page), h_cg, true);
>>> + __set_hugetlb_cgroup(folio, h_cg, true);
>>> }
>>>
>>> static inline bool hugetlb_cgroup_disabled(void)
>>> @@ -199,12 +199,12 @@ hugetlb_cgroup_from_folio_rsvd(struct folio *folio)
>>> return NULL;
>>> }
>>>
>>> -static inline void set_hugetlb_cgroup(struct page *page,
>>> +static inline void set_hugetlb_cgroup(struct folio *folio,
>>> struct hugetlb_cgroup *h_cg)
>>> {
>>> }
>>>
>>> -static inline void set_hugetlb_cgroup_rsvd(struct page *page,
>>> +static inline void set_hugetlb_cgroup_rsvd(struct folio *folio,
>>> struct hugetlb_cgroup *h_cg)
>>> {
>>> }
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 27b87dc85c48..a6384fb0b57f 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -1758,19 +1758,21 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)
>>> h->nr_huge_pages_node[nid]++;
>>> }
>>>
>>> -static void __prep_new_huge_page(struct hstate *h, struct page *page)
>>> +static void __prep_new_hugetlb_folio(struct hstate *h, struct folio *folio)
>>> {
>>> - hugetlb_vmemmap_optimize(h, page);
>>> - INIT_LIST_HEAD(&page->lru);
>>> - set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
>>> - hugetlb_set_page_subpool(page, NULL);
>>> - set_hugetlb_cgroup(page, NULL);
>>> - set_hugetlb_cgroup_rsvd(page, NULL);
>>> + hugetlb_vmemmap_optimize(h, &folio->page);
>>> + INIT_LIST_HEAD(&folio->lru);
>>> + folio->_folio_dtor = HUGETLB_PAGE_DTOR;
>> Seems like a variant of set_compound_page_dtor() for folio is missing,
>> e.g. set_large_folio_dtor(). It's better to add it in this series.
>>
>> Thanks.
>
> Hi Muchun thanks for taking a look,
>
> Would it be ok to add this functionality in a separate patch series? Some of the earlier patches in this series were

I'm ok.

Thanks.

> modified by Hugh's series[1] so I'm not sure how a v3 of this series would look now. Let me know which approach you would prefer.
>
>
> Thanks,
>
> Sidhartha Kumar
>
> [1] https://lore.kernel.org/linux-mm/[email protected]/
>
>>> + hugetlb_set_folio_subpool(folio, NULL);
>>> + set_hugetlb_cgroup(folio, NULL);
>>> + set_hugetlb_cgroup_rsvd(folio, NULL);
>>> }
>>>
>>> static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
>>> {
>>> - __prep_new_huge_page(h, page);
>>> + struct folio *folio = page_folio(page);
>>> +
>>> + __prep_new_hugetlb_folio(h, folio);
>>> spin_lock_irq(&hugetlb_lock);
>>> __prep_account_new_huge_page(h, nid);
>>> spin_unlock_irq(&hugetlb_lock);
>>> @@ -2731,8 +2733,10 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>> struct list_head *list)
>>> {
>>> gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
>>> - int nid = page_to_nid(old_page);
>>> + struct folio *old_folio = page_folio(old_page);
>>> + int nid = folio_nid(old_folio);
>>> struct page *new_page;
>>> + struct folio *new_folio;
>>> int ret = 0;
>>>
>>> /*
>>> @@ -2745,16 +2749,17 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>> new_page = alloc_buddy_huge_page(h, gfp_mask, nid, NULL, NULL);
>>> if (!new_page)
>>> return -ENOMEM;
>>> - __prep_new_huge_page(h, new_page);
>>> + new_folio = page_folio(new_page);
>>> + __prep_new_hugetlb_folio(h, new_folio);
>>>
>>> retry:
>>> spin_lock_irq(&hugetlb_lock);
>>> - if (!PageHuge(old_page)) {
>>> + if (!folio_test_hugetlb(old_folio)) {
>>> /*
>>> * Freed from under us. Drop new_page too.
>>> */
>>> goto free_new;
>>> - } else if (page_count(old_page)) {
>>> + } else if (folio_ref_count(old_folio)) {
>>> /*
>>> * Someone has grabbed the page, try to isolate it here.
>>> * Fail with -EBUSY if not possible.
>>> @@ -2763,7 +2768,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>> ret = isolate_hugetlb(old_page, list);
>>> spin_lock_irq(&hugetlb_lock);
>>> goto free_new;
>>> - } else if (!HPageFreed(old_page)) {
>>> + } else if (!folio_test_hugetlb_freed(old_folio)) {
>>> /*
>>> * Page's refcount is 0 but it has not been enqueued in the
>>> * freelist yet. Race window is small, so we can succeed here if
>>> @@ -2801,7 +2806,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>> free_new:
>>> spin_unlock_irq(&hugetlb_lock);
>>> /* Page has a zero ref count, but needs a ref to be freed */
>>> - set_page_refcounted(new_page);
>>> + folio_ref_unfreeze(new_folio, 1);
>>> update_and_free_page(h, new_page, false);
>>>
>>> return ret;
>>> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
>>> index 600c98560a0f..692b23b5d423 100644
>>> --- a/mm/hugetlb_cgroup.c
>>> +++ b/mm/hugetlb_cgroup.c
>>> @@ -212,7 +212,7 @@ static void hugetlb_cgroup_move_parent(int idx, struct hugetlb_cgroup *h_cg,
>>> /* Take the pages off the local counter */
>>> page_counter_cancel(counter, nr_pages);
>>>
>>> - set_hugetlb_cgroup(page, parent);
>>> + set_hugetlb_cgroup(folio, parent);
>>> out:
>>> return;
>>> }
>>> @@ -894,6 +894,7 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
>>> struct hugetlb_cgroup *h_cg_rsvd;
>>> struct hstate *h = page_hstate(oldhpage);
>>> struct folio *old_folio = page_folio(oldhpage);
>>> + struct folio *new_folio = page_folio(newhpage);
>>>
>>> if (hugetlb_cgroup_disabled())
>>> return;
>>> @@ -901,12 +902,12 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
>>> spin_lock_irq(&hugetlb_lock);
>>> h_cg = hugetlb_cgroup_from_folio(old_folio);
>>> h_cg_rsvd = hugetlb_cgroup_from_folio_rsvd(old_folio);
>>> - set_hugetlb_cgroup(oldhpage, NULL);
>>> - set_hugetlb_cgroup_rsvd(oldhpage, NULL);
>>> + set_hugetlb_cgroup(old_folio, NULL);
>>> + set_hugetlb_cgroup_rsvd(old_folio, NULL);
>>>
>>> /* move the h_cg details to new cgroup */
>>> - set_hugetlb_cgroup(newhpage, h_cg);
>>> - set_hugetlb_cgroup_rsvd(newhpage, h_cg_rsvd);
>>> + set_hugetlb_cgroup(new_folio, h_cg);
>>> + set_hugetlb_cgroup_rsvd(new_folio, h_cg_rsvd);
>>> list_move(&newhpage->lru, &h->hugepage_activelist);
>>> spin_unlock_irq(&hugetlb_lock);
>>> return;
>>> --
>>> 2.31.1
>>>
>>>
>>


2022-11-10 19:58:41

by Mike Kravetz

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] mm/hugetlb_cgroup: convert set_hugetlb_cgroup*() to folios

On 11/01/22 15:30, Sidhartha Kumar wrote:
> Allows __prep_new_huge_page() to operate on a folio by converting
> set_hugetlb_cgroup*() to take in a folio.
>
> Signed-off-by: Sidhartha Kumar <[email protected]>
> ---
> include/linux/hugetlb_cgroup.h | 12 ++++++------
> mm/hugetlb.c | 33 +++++++++++++++++++--------------
> mm/hugetlb_cgroup.c | 11 ++++++-----
> 3 files changed, 31 insertions(+), 25 deletions(-)

Muchun had the same comment I had in v1 about the need for a routine to
set the destructor of a folio. Since this will be added in a subsequent
series,

Reviewed-by: Mike Kravetz <[email protected]>
--
Mike Kravetz

2023-06-12 18:03:17

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm/hugetlb: convert isolate_or_dissolve_huge_page to folios

On Tue, Nov 01, 2022 at 03:30:55PM -0700, Sidhartha Kumar wrote:
> +++ b/mm/hugetlb.c
> @@ -2815,7 +2815,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
> int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
> {
> struct hstate *h;
> - struct page *head;
> + struct folio *folio = page_folio(page);

Is this safe? I was reviewing a different patch today, and I spotted
this. With THP, we can relatively easily hit this case:

struct page points to a page with pfn 0x40305, in a folio of order 2.
We call page_folio() on it and the resulting pointer is for the folio
with pfn 0x40304.
If we don't have our own refcount (or some other protection ...) against
freeing, the folio can now be freed and reallocated. Say it's now part
of an order-3 folio.
Our 'folio' pointer is now actually a pointer to a tail page, and we
have various assertions that a folio pointer doesn't point to a tail
page, so they trigger.

It seems to me that this ...

        /*
         * The page might have been dissolved from under our feet, so make sure
         * to carefully check the state under the lock.
         * Return success when racing as if we dissolved the page ourselves.
         */
        spin_lock_irq(&hugetlb_lock);
        if (folio_test_hugetlb(folio)) {
                h = folio_hstate(folio);
        } else {
                spin_unlock_irq(&hugetlb_lock);
                return 0;
        }

implies that we don't have our own reference on the folio, so we might
find a situation where the folio pointer we have is no longer a folio
pointer.

Maybe the page_folio() call should be moved inside the hugetlb_lock
protection? Is that enough? I don't know enough about how hugetlb
pages are split, freed & allocated to know what's going on.

But then we _drop_ the lock, and keep referring to ...

> @@ -2841,10 +2840,10 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
> if (hstate_is_gigantic(h))
> return -ENOMEM;
>
> - if (page_count(head) && !isolate_hugetlb(head, list))
> + if (folio_ref_count(folio) && !isolate_hugetlb(&folio->page, list))
> ret = 0;
> - else if (!page_count(head))
> - ret = alloc_and_dissolve_huge_page(h, head, list);
> + else if (!folio_ref_count(folio))
> + ret = alloc_and_dissolve_huge_page(h, &folio->page, list);

And I fall back to saying "I don't know enough to know if this is safe".
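
A minimal sketch of the reordering Matthew floats above, with page_folio()
taken only once hugetlb_lock is held. This is illustrative only; as the rest
of the thread discusses, it is not settled that this alone is sufficient,
since the folio is still referenced after the lock is dropped:

int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
{
        struct hstate *h;
        struct folio *folio;
        int ret = -EBUSY;

        /*
         * Sketch: convert page -> folio only under hugetlb_lock, so the
         * hugetlb check below is made against a head page that cannot be
         * dissolved concurrently.
         */
        spin_lock_irq(&hugetlb_lock);
        folio = page_folio(page);
        if (folio_test_hugetlb(folio)) {
                h = folio_hstate(folio);
        } else {
                spin_unlock_irq(&hugetlb_lock);
                return 0;
        }
        spin_unlock_irq(&hugetlb_lock);

        /* ... remainder as in the patch under discussion ... */
        return ret;
}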

2023-06-12 19:20:58

by Sidhartha Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm/hugetlb: convert isolate_or_dissolve_huge_page to folios

On 6/12/23 10:41 AM, Matthew Wilcox wrote:
> On Tue, Nov 01, 2022 at 03:30:55PM -0700, Sidhartha Kumar wrote:
>> +++ b/mm/hugetlb.c
>> @@ -2815,7 +2815,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>> int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
>> {
>> struct hstate *h;
>> - struct page *head;
>> + struct folio *folio = page_folio(page);
>
> Is this safe? I was reviewing a different patch today, and I spotted
> this. With THP, we can relatively easily hit this case:
>
> struct page points to a page with pfn 0x40305, in a folio of order 2.
> We call page_folio() on it and the resulting pointer is for the folio
> with pfn 0x40304.
> If we don't have our own refcount (or some other protection ...) against
> freeing, the folio can now be freed and reallocated. Say it's now part
> of an order-3 folio.
> Our 'folio' pointer is now actually a pointer to a tail page, and we
> have various assertions that a folio pointer doesn't point to a tail
> page, so they trigger.
>
> It seems to me that this ...
>
> /*
> * The page might have been dissolved from under our feet, so make sure
> * to carefully check the state under the lock.
> * Return success when racing as if we dissolved the page ourselves.
> */
> spin_lock_irq(&hugetlb_lock);
> if (folio_test_hugetlb(folio)) {
> h = folio_hstate(folio);
> } else {
> spin_unlock_irq(&hugetlb_lock);
> return 0;
> }
>
> implies that we don't have our own reference on the folio, so we might
> find a situation where the folio pointer we have is no longer a folio
> pointer.
>

If the folio became free and was reallocated, would this be considered a
success? If the folio is no longer a hugetlb folio,
isolate_or_dissolve_huge_page() returns as if it dissolved the page itself.

Later in the call stack, within alloc_and_dissolve_hugetlb_folio() there is

        if (!folio_test_hugetlb(old_folio)) {
                /*
                 * Freed from under us. Drop new_folio too.
                 */
                goto free_new;
        }

which would imply it is safe for the old_folio to have been dropped/freed.


> Maybe the page_folio() call should be moved inside the hugetlb_lock
> protection? Is that enough? I don't know enough about how hugetlb
> pages are split, freed & allocated to know what's going on.
> But then we _drop_ the lock, and keep referring to ...
>
>> @@ -2841,10 +2840,10 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
>> if (hstate_is_gigantic(h))
>> return -ENOMEM;
>>
>> - if (page_count(head) && !isolate_hugetlb(head, list))
>> + if (folio_ref_count(folio) && !isolate_hugetlb(&folio->page, list))
>> ret = 0;
>> - else if (!page_count(head))
>> - ret = alloc_and_dissolve_huge_page(h, head, list);
>> + else if (!folio_ref_count(folio))
>> + ret = alloc_and_dissolve_huge_page(h, &folio->page, list);
>
> And I fall back to saying "I don't know enough to know if this is safe".


2023-06-12 23:53:57

by Mike Kravetz

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm/hugetlb: convert isolate_or_dissolve_huge_page to folios

On 06/12/23 18:41, Matthew Wilcox wrote:
> On Tue, Nov 01, 2022 at 03:30:55PM -0700, Sidhartha Kumar wrote:
> > +++ b/mm/hugetlb.c
> > @@ -2815,7 +2815,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
> > int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
> > {
> > struct hstate *h;
> > - struct page *head;
> > + struct folio *folio = page_folio(page);
>
> Is this safe? I was reviewing a different patch today, and I spotted
> this. With THP, we can relatively easily hit this case:
>
> struct page points to a page with pfn 0x40305, in a folio of order 2.
> We call page_folio() on it and the resulting pointer is for the folio
> with pfn 0x40304.
> If we don't have our own refcount (or some other protection ...) against
> freeing, the folio can now be freed and reallocated. Say it's now part
> of an order-3 folio.
> Our 'folio' pointer is now actually a pointer to a tail page, and we
> have various assertions that a folio pointer doesn't point to a tail
> page, so they trigger.
>
> It seems to me that this ...
>
> /*
> * The page might have been dissolved from under our feet, so make sure
> * to carefully check the state under the lock.
> * Return success when racing as if we dissolved the page ourselves.
> */
> spin_lock_irq(&hugetlb_lock);
> if (folio_test_hugetlb(folio)) {
> h = folio_hstate(folio);
> } else {
> spin_unlock_irq(&hugetlb_lock);
> return 0;
> }
>
> implies that we don't have our own reference on the folio, so we might
> find a situation where the folio pointer we have is no longer a folio
> pointer.

Your analysis is correct.

This is not safe because we hold no locks or references. The folio
pointer obtained via page_folio(page) may not be valid when calling
folio_test_hugetlb(folio) and later.

My bad for the Reviewed-by: :(

>
> Maybe the page_folio() call should be moved inside the hugetlb_lock
> protection? Is that enough? I don't know enough about how hugetlb
> pages are split, freed & allocated to know what's going on.
>
> But then we _drop_ the lock, and keep referring to ...
>
> > @@ -2841,10 +2840,10 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
> > if (hstate_is_gigantic(h))
> > return -ENOMEM;
> >
> > - if (page_count(head) && !isolate_hugetlb(head, list))
> > + if (folio_ref_count(folio) && !isolate_hugetlb(&folio->page, list))
> > ret = 0;
> > - else if (!page_count(head))
> > - ret = alloc_and_dissolve_huge_page(h, head, list);
> > + else if (!folio_ref_count(folio))
> > + ret = alloc_and_dissolve_huge_page(h, &folio->page, list);

The above was OK when using struct page instead of folio. The 'racy'
part was getting the ref count on the head page. It was OK because this
was only a check to see if we should TRY to isolate or dissolve. The
code to actually isolate or dissolve would take the appropriate locks.

I'm afraid the code is now making even more use of a potentially invalid
folio. Here is how the above now looks in v6.3:

        spin_unlock_irq(&hugetlb_lock);

        /*
         * Fence off gigantic pages as there is a cyclic dependency between
         * alloc_contig_range and them. Return -ENOMEM as this has the effect
         * of bailing out right away without further retrying.
         */
        if (hstate_is_gigantic(h))
                return -ENOMEM;

        if (folio_ref_count(folio) && isolate_hugetlb(folio, list))
                ret = 0;
        else if (!folio_ref_count(folio))
                ret = alloc_and_dissolve_hugetlb_folio(h, folio, list);

Looks like that potentially invalid folio is being passed to other
routines. Previous code would take lock and revalidate that struct page
was still a hugetlb page. We can not do the same with a folio.
--
Mike Kravetz

2023-06-13 23:43:17

by Mike Kravetz

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm/hugetlb: convert isolate_or_dissolve_huge_page to folios

On 06/12/23 16:34, Mike Kravetz wrote:
> On 06/12/23 18:41, Matthew Wilcox wrote:
> > On Tue, Nov 01, 2022 at 03:30:55PM -0700, Sidhartha Kumar wrote:
> > > +++ b/mm/hugetlb.c
> > > @@ -2815,7 +2815,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
> > > int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
> > > {
> > > struct hstate *h;
> > > - struct page *head;
> > > + struct folio *folio = page_folio(page);
> >
> > Is this safe? I was reviewing a different patch today, and I spotted
> > this. With THP, we can relatively easily hit this case:
> >
> > struct page points to a page with pfn 0x40305, in a folio of order 2.
> > We call page_folio() on it and the resulting pointer is for the folio
> > with pfn 0x40304.
> > If we don't have our own refcount (or some other protection ...) against
> > freeing, the folio can now be freed and reallocated. Say it's now part
> > of an order-3 folio.
> > Our 'folio' pointer is now actually a pointer to a tail page, and we
> > have various assertions that a folio pointer doesn't point to a tail
> > page, so they trigger.
> >
> > It seems to me that this ...
> >
> > /*
> > * The page might have been dissolved from under our feet, so make sure
> > * to carefully check the state under the lock.
> > * Return success when racing as if we dissolved the page ourselves.
> > */
> > spin_lock_irq(&hugetlb_lock);
> > if (folio_test_hugetlb(folio)) {
> > h = folio_hstate(folio);
> > } else {
> > spin_unlock_irq(&hugetlb_lock);
> > return 0;
> > }
> >
> > implies that we don't have our own reference on the folio, so we might
> > find a situation where the folio pointer we have is no longer a folio
> > pointer.
>
> Your analysis is correct.
>
> This is not safe because we hold no locks or references. The folio
> pointer obtained via page_folio(page) may not be valid when calling
> folio_test_hugetlb(folio) and later.
>
> My bad for the Reviewed-by: :(
>

I was looking at this more closely and need a bit of clarification. As
mentioned, your analysis is correct. However, it appears that there is
other code doing:

folio = page_folio(page);
...
if (folio_test_hugetlb(folio))

without holding a folio ref or some type of lock. split_huge_pages_all()
is one such example.

So, either this code has the same issue or there are folio routines that
can be called without holding a ref/lock. The kerneldoc for
folio_test_hugetlb says "Caller should have a reference on the folio to
prevent it from being turned into a tail page." However, is that mostly
to make sure the returned value is consistent/valid? Can it really lead
to an assert if the folio pointer is changed to point to something else?

> > Maybe the page_folio() call should be moved inside the hugetlb_lock
> > protection? Is that enough? I don't know enough about how hugetlb
> > pages are split, freed & allocated to know what's going on.

Upon further thought, I think we should move the page_folio() inside the
lock just to be more correct.

> >
> > But then we _drop_ the lock, and keep referring to ...
> >
> > > @@ -2841,10 +2840,10 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
> > > if (hstate_is_gigantic(h))
> > > return -ENOMEM;
> > >
> > > - if (page_count(head) && !isolate_hugetlb(head, list))
> > > + if (folio_ref_count(folio) && !isolate_hugetlb(&folio->page, list))
> > > ret = 0;
> > > - else if (!page_count(head))
> > > - ret = alloc_and_dissolve_huge_page(h, head, list);
> > > + else if (!folio_ref_count(folio))
> > > + ret = alloc_and_dissolve_huge_page(h, &folio->page, list);
>
> The above was OK when using struct page instead of folio. The 'racy'
> part was getting the ref count on the head page. It was OK because this
> was only a check to see if we should TRY to isolate or dissolve. The
> code to actually isolate or dissolve would take the appropriate locks.

page_count() is doing 'folio_ref_count(page_folio(page));', and I suspect
there are many places calling page_count() without taking a page ref or locking.
So, it seems like this would also be safe?
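
For reference, the wrapper being quoted here is, in the kernels under
discussion, roughly the following (a paraphrase, not copied from a specific
tree):

/* page_count() is a thin wrapper over the folio refcount. */
static inline int page_count(const struct page *page)
{
        return folio_ref_count(page_folio(page));
}

so every bare page_count() caller performs the same unlocked page_folio()
conversion being questioned here.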

> I'm afraid the code is now making even more use of a potentially invalid
> folio. Here is how the above now looks in v6.3:
>
> spin_unlock_irq(&hugetlb_lock);
>
> /*
> * Fence off gigantic pages as there is a cyclic dependency between
> * alloc_contig_range and them. Return -ENOMEM as this has the effect
> * of bailing out right away without further retrying.
> */
> if (hstate_is_gigantic(h))
> return -ENOMEM;
>
> if (folio_ref_count(folio) && isolate_hugetlb(folio, list))
> ret = 0;
> else if (!folio_ref_count(folio))
> ret = alloc_and_dissolve_hugetlb_folio(h, folio, list);
>
> Looks like that potentially invalid folio is being passed to other
> routines. Previous code would take lock and revalidate that struct page
> was still a hugetlb page. We can not do the same with a folio.

Perhaps I spoke too soon. Yes, we pass a potentially invalid folio
pointer to isolate_hugetlb() and alloc_and_dissolve_hugetlb_folio().
However, it seems the validation they perform should be sufficient.

bool isolate_hugetlb(struct folio *folio, struct list_head *list)
{
        bool ret = true;

        spin_lock_irq(&hugetlb_lock);
        if (!folio_test_hugetlb(folio) ||
            !folio_test_hugetlb_migratable(folio) ||
            !folio_try_get(folio)) {
                ret = false;
                goto unlock;


static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
                        struct folio *old_folio, struct list_head *list)
{
        ...
retry:
        spin_lock_irq(&hugetlb_lock);
        if (!folio_test_hugetlb(old_folio)) {
                ...
        } else if (folio_ref_count(old_folio)) {
                ...
        } else if (!folio_test_hugetlb_freed(old_folio)) {
                ...
                goto retry;
        } else {
                /*
                 * Ok, old_folio is still a genuine free hugepage.

Upon further consideration, I do not see an issue with the existing
code. If there are issues with calling folio_test_hugetlb() or
folio_ref_count() on a potentially invalid folio pointer, then we do
have issues here. However, such an issue would be more widespread as
there is more code doing the same.
--
Mike Kravetz