This patch series converts migrate_page_add() and queue_pages_required()
to migrate_folio_add() and queue_folio_required(). It also converts the
callers of the functions to use folios as well, and introduces a helper
function to estimate a folio's mapcount.
---
v2:
- Introduce folio_estimated_mapcount() to replace page_mapcount() in
migrate_page_add() and queue_pages_hugetlb().
- Elaborate on the comments to make it clear what the mapcount check is
for and why it is being done this way.
Vishal Moola (Oracle) (6):
mm: Add folio_estimated_mapcount()
mm/mempolicy: Convert queue_pages_pmd() to queue_folios_pmd()
mm/mempolicy: Convert queue_pages_pte_range() to
queue_folios_pte_range()
mm/mempolicy: Convert queue_pages_hugetlb() to queue_folios_hugetlb()
mm/mempolicy: Convert queue_pages_required() to queue_folio_required()
mm/mempolicy: Convert migrate_page_add() to migrate_folio_add()
include/linux/mm.h | 5 ++
mm/mempolicy.c | 122 ++++++++++++++++++++++++---------------------
2 files changed, 70 insertions(+), 57 deletions(-)
--
2.38.1
folio_estimated_mapcount() takes in a folio and calls page_mapcount() on
the first page of that folio.
This is necessary for folio conversions where we only care about either the
entire_mapcount of a large folio, or the mapcount of a folio that is not
large. This is in contrast to folio_mapcount(), which calculates the total
number of times a folio and its subpages are mapped.
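As an illustration (not part of this patch; the numbers are only meant to
show the contrast, assuming a 4-page folio PTE-mapped into two processes):
	/*
	 * Each of the 4 subpages is mapped twice, once per process:
	 *
	 *	folio_estimated_mapcount(folio) == 2	(first subpage only)
	 *	folio_mapcount(folio)           == 8	(sums every subpage)
	 *
	 * A PMD-mapped THP mapped by a single process returns 1 for both,
	 * since only the entire_mapcount is involved.
	 */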
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
include/linux/mm.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c9db257f09b3..543c360f7ecc 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -875,6 +875,11 @@ static inline int page_mapcount(struct page *page)
return mapcount;
}
+static inline int folio_estimated_mapcount(struct folio *folio)
+{
+ return page_mapcount(folio_page(folio, 0));
+}
+
int folio_total_mapcount(struct folio *folio);
/**
--
2.38.1
The function now operates on a folio instead of
the page associated with a pmd.
This change is in preparation for the conversion of
queue_pages_required() to queue_folio_required() and migrate_page_add()
to migrate_folio_add().
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
mm/mempolicy.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index fd99d303e34f..00fffa93adae 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -442,21 +442,21 @@ static inline bool queue_pages_required(struct page *page,
}
/*
- * queue_pages_pmd() has three possible return values:
- * 0 - pages are placed on the right node or queued successfully, or
+ * queue_folios_pmd() has three possible return values:
+ * 0 - folios are placed on the right node or queued successfully, or
* special page is met, i.e. huge zero page.
- * 1 - there is unmovable page, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * 1 - there is unmovable folio, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
* specified.
* -EIO - is migration entry or only MPOL_MF_STRICT was specified and an
- * existing page was already on a node that does not follow the
+ * existing folio was already on a node that does not follow the
* policy.
*/
-static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
+static int queue_folios_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
unsigned long end, struct mm_walk *walk)
__releases(ptl)
{
int ret = 0;
- struct page *page;
+ struct folio *folio;
struct queue_pages *qp = walk->private;
unsigned long flags;
@@ -464,19 +464,19 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
ret = -EIO;
goto unlock;
}
- page = pmd_page(*pmd);
- if (is_huge_zero_page(page)) {
+ folio = pfn_folio(pmd_pfn(*pmd));
+ if (is_huge_zero_page(&folio->page)) {
walk->action = ACTION_CONTINUE;
goto unlock;
}
- if (!queue_pages_required(page, qp))
+ if (!queue_pages_required(&folio->page, qp))
goto unlock;
flags = qp->flags;
- /* go to thp migration */
+ /* go to folio migration */
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
if (!vma_migratable(walk->vma) ||
- migrate_page_add(page, qp->pagelist, flags)) {
+ migrate_page_add(&folio->page, qp->pagelist, flags)) {
ret = 1;
goto unlock;
}
@@ -512,7 +512,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
ptl = pmd_trans_huge_lock(pmd, vma);
if (ptl)
- return queue_pages_pmd(pmd, ptl, addr, end, walk);
+ return queue_folios_pmd(pmd, ptl, addr, end, walk);
if (pmd_trans_unstable(pmd))
return 0;
--
2.38.1
This function now operates on folios associated with ptes instead of
pages.
This change is in preparation for the conversion of
queue_pages_required() to queue_folio_required() and migrate_page_add()
to migrate_folio_add().
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
mm/mempolicy.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 00fffa93adae..ae9d16124f45 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -491,19 +491,19 @@ static int queue_folios_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
* Scan through pages checking if pages follow certain conditions,
* and move them to the pagelist if they do.
*
- * queue_pages_pte_range() has three possible return values:
- * 0 - pages are placed on the right node or queued successfully, or
+ * queue_folios_pte_range() has three possible return values:
+ * 0 - folios are placed on the right node or queued successfully, or
* special page is met, i.e. zero page.
- * 1 - there is unmovable page, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
+ * 1 - there is unmovable folio, and MPOL_MF_MOVE* & MPOL_MF_STRICT were
* specified.
- * -EIO - only MPOL_MF_STRICT was specified and an existing page was already
+ * -EIO - only MPOL_MF_STRICT was specified and an existing folio was already
* on a node that does not follow the policy.
*/
-static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
+static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, struct mm_walk *walk)
{
struct vm_area_struct *vma = walk->vma;
- struct page *page;
+ struct folio *folio;
struct queue_pages *qp = walk->private;
unsigned long flags = qp->flags;
bool has_unmovable = false;
@@ -521,16 +521,16 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
for (; addr != end; pte++, addr += PAGE_SIZE) {
if (!pte_present(*pte))
continue;
- page = vm_normal_page(vma, addr, *pte);
- if (!page || is_zone_device_page(page))
+ folio = vm_normal_folio(vma, addr, *pte);
+ if (!folio || folio_is_zone_device(folio))
continue;
/*
- * vm_normal_page() filters out zero pages, but there might
- * still be PageReserved pages to skip, perhaps in a VDSO.
+ * vm_normal_folio() filters out zero pages, but there might
+ * still be reserved folios to skip, perhaps in a VDSO.
*/
- if (PageReserved(page))
+ if (folio_test_reserved(folio))
continue;
- if (!queue_pages_required(page, qp))
+ if (!queue_pages_required(&folio->page, qp))
continue;
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
/* MPOL_MF_STRICT must be specified if we get here */
@@ -544,7 +544,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
* temporary off LRU pages in the range. Still
* need migrate other LRU pages.
*/
- if (migrate_page_add(page, qp->pagelist, flags))
+ if (migrate_page_add(&folio->page, qp->pagelist, flags))
has_unmovable = true;
} else
break;
@@ -703,7 +703,7 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
static const struct mm_walk_ops queue_pages_walk_ops = {
.hugetlb_entry = queue_pages_hugetlb,
- .pmd_entry = queue_pages_pte_range,
+ .pmd_entry = queue_folios_pte_range,
.test_walk = queue_pages_test_walk,
};
--
2.38.1
The function now operates on the folio associated with the hugetlb pte
instead of the page.
This change is in preparation for the conversion of
queue_pages_required() to queue_folio_required() and migrate_page_add()
to migrate_folio_add().
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
mm/mempolicy.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index ae9d16124f45..ea8cac447e04 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -558,7 +558,7 @@ static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
return addr != end ? -EIO : 0;
}
-static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
+static int queue_folios_hugetlb(pte_t *pte, unsigned long hmask,
unsigned long addr, unsigned long end,
struct mm_walk *walk)
{
@@ -566,7 +566,7 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
#ifdef CONFIG_HUGETLB_PAGE
struct queue_pages *qp = walk->private;
unsigned long flags = (qp->flags & MPOL_MF_VALID);
- struct page *page;
+ struct folio *folio;
spinlock_t *ptl;
pte_t entry;
@@ -574,13 +574,13 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
entry = huge_ptep_get(pte);
if (!pte_present(entry))
goto unlock;
- page = pte_page(entry);
- if (!queue_pages_required(page, qp))
+ folio = pfn_folio(pte_pfn(entry));
+ if (!queue_pages_required(&folio->page, qp))
goto unlock;
if (flags == MPOL_MF_STRICT) {
/*
- * STRICT alone means only detecting misplaced page and no
+ * STRICT alone means only detecting misplaced folio and no
* need to further check other vma.
*/
ret = -EIO;
@@ -591,20 +591,27 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
/*
* Must be STRICT with MOVE*, otherwise .test_walk() have
* stopped walking current vma.
- * Detecting misplaced page but allow migrating pages which
+ * Detecting misplaced folio but allow migrating folios which
* have been queued.
*/
ret = 1;
goto unlock;
}
- /* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
+ /*
+ * With MPOL_MF_MOVE, we try to migrate only unshared folios. If it
+ * is shared it is likely not worth migrating.
+ *
+ * To check if the folio is shared, ideally we want to make sure
+ * every page is mapped to the same process. Doing that is very
+ * expensive, so check the estimated mapcount of the folio instead.
+ */
if (flags & (MPOL_MF_MOVE_ALL) ||
- (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
- if (isolate_hugetlb(page_folio(page), qp->pagelist) &&
+ (flags & MPOL_MF_MOVE && folio_estimated_mapcount(folio) == 1)) {
+ if (isolate_hugetlb(folio, qp->pagelist) &&
(flags & MPOL_MF_STRICT))
/*
- * Failed to isolate page but allow migrating pages
+ * Failed to isolate folio but allow migrating folios
* which have been queued.
*/
ret = 1;
@@ -702,7 +709,7 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
}
static const struct mm_walk_ops queue_pages_walk_ops = {
- .hugetlb_entry = queue_pages_hugetlb,
+ .hugetlb_entry = queue_folios_hugetlb,
.pmd_entry = queue_folios_pte_range,
.test_walk = queue_pages_test_walk,
};
--
2.38.1
Replace queue_pages_required() with queue_folio_required().
queue_folio_required() does the same as queue_pages_required(), except
it takes in a folio instead of a page.
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
mm/mempolicy.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index ea8cac447e04..da87644430e3 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -427,15 +427,15 @@ struct queue_pages {
};
/*
- * Check if the page's nid is in qp->nmask.
+ * Check if the folio's nid is in qp->nmask.
*
* If MPOL_MF_INVERT is set in qp->flags, check if the nid is
* in the invert of qp->nmask.
*/
-static inline bool queue_pages_required(struct page *page,
+static inline bool queue_folio_required(struct folio *folio,
struct queue_pages *qp)
{
- int nid = page_to_nid(page);
+ int nid = folio_nid(folio);
unsigned long flags = qp->flags;
return node_isset(nid, *qp->nmask) == !(flags & MPOL_MF_INVERT);
@@ -469,7 +469,7 @@ static int queue_folios_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
walk->action = ACTION_CONTINUE;
goto unlock;
}
- if (!queue_pages_required(&folio->page, qp))
+ if (!queue_folio_required(folio, qp))
goto unlock;
flags = qp->flags;
@@ -530,7 +530,7 @@ static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
*/
if (folio_test_reserved(folio))
continue;
- if (!queue_pages_required(&folio->page, qp))
+ if (!queue_folio_required(folio, qp))
continue;
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
/* MPOL_MF_STRICT must be specified if we get here */
@@ -575,7 +575,7 @@ static int queue_folios_hugetlb(pte_t *pte, unsigned long hmask,
if (!pte_present(entry))
goto unlock;
folio = pfn_folio(pte_pfn(entry));
- if (!queue_pages_required(&folio->page, qp))
+ if (!queue_folio_required(folio, qp))
goto unlock;
if (flags == MPOL_MF_STRICT) {
--
2.38.1
Replace migrate_page_add() with migrate_folio_add().
migrate_folio_add() does the same as migrate_page_add(), but takes in a
folio instead of a page. This removes a couple of calls to
compound_head().
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
mm/mempolicy.c | 39 ++++++++++++++++++++-------------------
1 file changed, 20 insertions(+), 19 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index da87644430e3..9bb4600c4294 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -414,7 +414,7 @@ static const struct mempolicy_operations mpol_ops[MPOL_MAX] = {
},
};
-static int migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_folio_add(struct folio *folio, struct list_head *foliolist,
unsigned long flags);
struct queue_pages {
@@ -476,7 +476,7 @@ static int queue_folios_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
/* go to folio migration */
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
if (!vma_migratable(walk->vma) ||
- migrate_page_add(&folio->page, qp->pagelist, flags)) {
+ migrate_folio_add(folio, qp->pagelist, flags)) {
ret = 1;
goto unlock;
}
@@ -544,7 +544,7 @@ static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
* temporary off LRU pages in the range. Still
* need migrate other LRU pages.
*/
- if (migrate_page_add(&folio->page, qp->pagelist, flags))
+ if (migrate_folio_add(folio, qp->pagelist, flags))
has_unmovable = true;
} else
break;
@@ -1029,27 +1029,28 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
}
#ifdef CONFIG_MIGRATION
-/*
- * page migration, thp tail pages can be passed.
- */
-static int migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_folio_add(struct folio *folio, struct list_head *foliolist,
unsigned long flags)
{
- struct page *head = compound_head(page);
/*
- * Avoid migrating a page that is shared with others.
+ * We try to migrate only unshared folios. If it is shared it
+ * is likely not worth migrating.
+ *
+ * To check if the folio is shared, ideally we want to make sure
+ * every page is mapped to the same process. Doing that is very
+ * expensive, so check the estimated mapcount of the folio instead.
*/
- if ((flags & MPOL_MF_MOVE_ALL) || page_mapcount(head) == 1) {
- if (!isolate_lru_page(head)) {
- list_add_tail(&head->lru, pagelist);
- mod_node_page_state(page_pgdat(head),
- NR_ISOLATED_ANON + page_is_file_lru(head),
- thp_nr_pages(head));
+ if ((flags & MPOL_MF_MOVE_ALL) || folio_estimated_mapcount(folio) == 1) {
+ if (!folio_isolate_lru(folio)) {
+ list_add_tail(&folio->lru, foliolist);
+ node_stat_mod_folio(folio,
+ NR_ISOLATED_ANON + folio_is_file_lru(folio),
+ folio_nr_pages(folio));
} else if (flags & MPOL_MF_STRICT) {
/*
- * Non-movable page may reach here. And, there may be
- * temporary off LRU pages or non-LRU movable pages.
- * Treat them as unmovable pages since they can't be
+ * Non-movable folio may reach here. And, there may be
+ * temporary off LRU folios or non-LRU movable folios.
+ * Treat them as unmovable folios since they can't be
* isolated, so they can't be moved at the moment. It
* should return -EIO for this case too.
*/
@@ -1241,7 +1242,7 @@ static struct page *new_page(struct page *page, unsigned long start)
}
#else
-static int migrate_page_add(struct page *page, struct list_head *pagelist,
+static int migrate_folio_add(struct folio *folio, struct list_head *foliolist,
unsigned long flags)
{
return -EIO;
--
2.38.1
Hi Vishal,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on next-20230123]
[cannot apply to linus/master v6.2-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Vishal-Moola-Oracle/mm-Add-folio_estimated_mapcount/20230124-092349
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20230124012210.13963-2-vishal.moola%40gmail.com
patch subject: [PATCH mm-unstable v2 1/6] mm: Add folio_estimated_mapcount()
config: alpha-allyesconfig (https://download.01.org/0day-ci/archive/20230124/[email protected]/config)
compiler: alpha-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/2ec1fab96da69cd5e71330186987468d7d1a2595
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Vishal-Moola-Oracle/mm-Add-folio_estimated_mapcount/20230124-092349
git checkout 2ec1fab96da69cd5e71330186987468d7d1a2595
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=alpha olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=alpha prepare
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
In file included from arch/alpha/include/asm/page.h:93,
from include/linux/shm.h:6,
from include/linux/sched.h:16,
from arch/alpha/kernel/asm-offsets.c:10:
include/linux/mm.h: In function 'folio_estimated_mapcount':
>> include/asm-generic/memory_model.h:35:21: error: implicit declaration of function 'page_to_section'; did you mean 'present_section'? [-Werror=implicit-function-declaration]
35 | int __sec = page_to_section(__pg); \
| ^~~~~~~~~~~~~~~
include/asm-generic/memory_model.h:40:32: note: in definition of macro '__pfn_to_page'
40 | ({ unsigned long __pfn = (pfn); \
| ^~~
include/asm-generic/memory_model.h:52:21: note: in expansion of macro '__page_to_pfn'
52 | #define page_to_pfn __page_to_pfn
| ^~~~~~~~~~~~~
include/linux/mm.h:216:38: note: in expansion of macro 'page_to_pfn'
216 | #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
| ^~~~~~~~~~~
include/linux/page-flags.h:286:33: note: in expansion of macro 'nth_page'
286 | #define folio_page(folio, n) nth_page(&(folio)->page, n)
| ^~~~~~~~
include/linux/mm.h:918:30: note: in expansion of macro 'folio_page'
918 | return page_mapcount(folio_page(folio, 0));
| ^~~~~~~~~~
In file included from include/linux/pid_namespace.h:7,
from include/linux/ptrace.h:10,
from arch/alpha/kernel/asm-offsets.c:11:
include/linux/mm.h: At top level:
>> include/linux/mm.h:1626:29: error: conflicting types for 'page_to_section'; have 'long unsigned int(const struct page *)'
1626 | static inline unsigned long page_to_section(const struct page *page)
| ^~~~~~~~~~~~~~~
include/asm-generic/memory_model.h:35:21: note: previous implicit declaration of 'page_to_section' with type 'int()'
35 | int __sec = page_to_section(__pg); \
| ^~~~~~~~~~~~~~~
include/asm-generic/memory_model.h:40:32: note: in definition of macro '__pfn_to_page'
40 | ({ unsigned long __pfn = (pfn); \
| ^~~
include/asm-generic/memory_model.h:52:21: note: in expansion of macro '__page_to_pfn'
52 | #define page_to_pfn __page_to_pfn
| ^~~~~~~~~~~~~
include/linux/mm.h:216:38: note: in expansion of macro 'page_to_pfn'
216 | #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
| ^~~~~~~~~~~
include/linux/page-flags.h:286:33: note: in expansion of macro 'nth_page'
286 | #define folio_page(folio, n) nth_page(&(folio)->page, n)
| ^~~~~~~~
include/linux/mm.h:918:30: note: in expansion of macro 'folio_page'
918 | return page_mapcount(folio_page(folio, 0));
| ^~~~~~~~~~
arch/alpha/kernel/asm-offsets.c:15:6: warning: no previous prototype for 'foo' [-Wmissing-prototypes]
15 | void foo(void)
| ^~~
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:114: arch/alpha/kernel/asm-offsets.s] Error 1
make[2]: Target 'prepare' not remade because of errors.
make[1]: *** [Makefile:1286: prepare0] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:242: __sub-make] Error 2
make: Target 'prepare' not remade because of errors.
vim +1626 include/linux/mm.h
bf4e8902ee5080 Daniel Kiper 2011-05-24 1625
aa462abe8aaf21 Ian Campbell 2011-08-17 @1626 static inline unsigned long page_to_section(const struct page *page)
d41dee369bff3b Andy Whitcroft 2005-06-23 1627 {
d41dee369bff3b Andy Whitcroft 2005-06-23 1628 return (page->flags >> SECTIONS_PGSHIFT) & SECTIONS_MASK;
d41dee369bff3b Andy Whitcroft 2005-06-23 1629 }
308c05e35e3517 Christoph Lameter 2008-04-28 1630 #endif
d41dee369bff3b Andy Whitcroft 2005-06-23 1631
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
On 24.01.23 02:22, Vishal Moola (Oracle) wrote:
> folio_estimated_mapcount() takes in a folio and calls page_mapcount() on
> the first page of that folio.
>
> This is necessary for folio conversions where we only care about either the
> entire_mapcount of a large folio, or the mapcount of a not large folio.
>
> This is in contrast to folio_mapcount() which calculates the total
> number of the times a folio and its subpages are mapped.
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> ---
> include/linux/mm.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index c9db257f09b3..543c360f7ecc 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -875,6 +875,11 @@ static inline int page_mapcount(struct page *page)
> return mapcount;
> }
>
> +static inline int folio_estimated_mapcount(struct folio *folio)
> +{
> + return page_mapcount(folio_page(folio, 0));
> +}
> +
> int folio_total_mapcount(struct folio *folio);
>
> /**
I'm sorry, but "estimated" as absolutely unclear semantics. You could
have a THP mapped into 9999 processes using THP and the estimation would
be "0".
Huh? Absolutely unclear and confusing. No thanks.
--
Thanks,
David / dhildenb
On 25.01.23 11:20, David Hildenbrand wrote:
> On 24.01.23 02:22, Vishal Moola (Oracle) wrote:
>> folio_estimated_mapcount() takes in a folio and calls page_mapcount() on
>> the first page of that folio.
>>
>> This is necessary for folio conversions where we only care about either the
>> entire_mapcount of a large folio, or the mapcount of a not large folio.
>>
>> This is in contrast to folio_mapcount() which calculates the total
>> number of the times a folio and its subpages are mapped.
>>
>> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
>> ---
>> include/linux/mm.h | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index c9db257f09b3..543c360f7ecc 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -875,6 +875,11 @@ static inline int page_mapcount(struct page *page)
>> return mapcount;
>> }
>>
>> +static inline int folio_estimated_mapcount(struct folio *folio)
>> +{
>> + return page_mapcount(folio_page(folio, 0));
>> +}
>> +
>> int folio_total_mapcount(struct folio *folio);
>>
>> /**
>
> I'm sorry, but "estimated" as absolutely unclear semantics. You could
> have a THP mapped into 9999 processes using THP and the estimation would
> be "0".
... or would it be 9999? What about a PMD-mapped THP? What about a
partially unmapped THP?
What are we estimating?
--
Thanks,
David / dhildenb
On 25.01.23 11:24, David Hildenbrand wrote:
> On 25.01.23 11:20, David Hildenbrand wrote:
>> On 24.01.23 02:22, Vishal Moola (Oracle) wrote:
>>> folio_estimated_mapcount() takes in a folio and calls page_mapcount() on
>>> the first page of that folio.
>>>
>>> This is necessary for folio conversions where we only care about either the
>>> entire_mapcount of a large folio, or the mapcount of a not large folio.
>>>
>>> This is in contrast to folio_mapcount() which calculates the total
>>> number of the times a folio and its subpages are mapped.
>>>
>>> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
>>> ---
>>> include/linux/mm.h | 5 +++++
>>> 1 file changed, 5 insertions(+)
>>>
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index c9db257f09b3..543c360f7ecc 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -875,6 +875,11 @@ static inline int page_mapcount(struct page *page)
>>> return mapcount;
>>> }
>>>
>>> +static inline int folio_estimated_mapcount(struct folio *folio)
>>> +{
>>> + return page_mapcount(folio_page(folio, 0));
>>> +}
>>> +
>>> int folio_total_mapcount(struct folio *folio);
>>>
>>> /**
>>
>> I'm sorry, but "estimated" as absolutely unclear semantics. You could
>> have a THP mapped into 9999 processes using THP and the estimation would
>> be "0".
>
> ... or would it be 9999 ? What about a PMD-mapped THP? What about a
> partially unmapped THP?
>
> What are we estimating?
Thinking about mapcounts again, might not have been my smartest moment.
What we return here is the precise number of times the first subpage is
mapped (via the large folio and directly). That's supposed to be an
estimate for the number of times any subpage part of the folio is mapped.
I really don't know a better name, but folio_estimated_mapcount() does
not feel completely right to me and triggered my confusion in the first
place ... hm ...
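Concretely (illustrative numbers only):
	/*
	 * PTE-mapped THP shared by two processes:
	 *	page_mapcount(folio_page(folio, 0)) == 2
	 * PMD-mapped THP mapped by one process:
	 *	page_mapcount(folio_page(folio, 0)) == 1	(entire mapping only)
	 * Partially unmapped THP:
	 *	whatever the first subpage happens to report, which may not
	 *	match the other subpages at all
	 */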
--
Thanks,
David / dhildenb
On Wed, Jan 25, 2023 at 1:29 PM David Hildenbrand <[email protected]> wrote:
>
> On 25.01.23 11:24, David Hildenbrand wrote:
> > On 25.01.23 11:20, David Hildenbrand wrote:
> >> On 24.01.23 02:22, Vishal Moola (Oracle) wrote:
> >>> folio_estimated_mapcount() takes in a folio and calls page_mapcount() on
> >>> the first page of that folio.
> >>>
> >>> This is necessary for folio conversions where we only care about either the
> >>> entire_mapcount of a large folio, or the mapcount of a not large folio.
> >>>
> >>> This is in contrast to folio_mapcount() which calculates the total
> >>> number of the times a folio and its subpages are mapped.
> >>>
> >>> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> >>> ---
> >>> include/linux/mm.h | 5 +++++
> >>> 1 file changed, 5 insertions(+)
> >>>
> >>> diff --git a/include/linux/mm.h b/include/linux/mm.h
> >>> index c9db257f09b3..543c360f7ecc 100644
> >>> --- a/include/linux/mm.h
> >>> +++ b/include/linux/mm.h
> >>> @@ -875,6 +875,11 @@ static inline int page_mapcount(struct page *page)
> >>> return mapcount;
> >>> }
> >>>
> >>> +static inline int folio_estimated_mapcount(struct folio *folio)
> >>> +{
> >>> + return page_mapcount(folio_page(folio, 0));
> >>> +}
> >>> +
> >>> int folio_total_mapcount(struct folio *folio);
> >>>
> >>> /**
> >>
> >> I'm sorry, but "estimated" as absolutely unclear semantics. You could
> >> have a THP mapped into 9999 processes using THP and the estimation would
> >> be "0".
> >
> > ... or would it be 9999 ? What about a PMD-mapped THP? What about a
> > partially unmapped THP?
> >
> > What are we estimating?
>
> Thinking about mapcounts again, might not have been my smartest moment.
>
> What we return here is the precise number of times the first subpage is
> mapped (via the large folio and directly). That's supposed to be an
> estimate for the number of times any subpage part of the folio is mapped.
>
> I really don't know a better name, but folio_estimated_mapcount() does
> not feel completely right to me and triggere dmy confusion in the first
> place ... hm ...
I can understand the confusion, but I can't think of a better name
either myself. I'll go ahead and add a comment to make the purpose
of this function more clear. Looks like I'll have to move it to get rid
of the build warnings/errors anyway.
On 25.01.23 23:09, Vishal Moola wrote:
> On Wed, Jan 25, 2023 at 1:29 PM David Hildenbrand <[email protected]> wrote:
>>
>> On 25.01.23 11:24, David Hildenbrand wrote:
>>> On 25.01.23 11:20, David Hildenbrand wrote:
>>>> On 24.01.23 02:22, Vishal Moola (Oracle) wrote:
>>>>> folio_estimated_mapcount() takes in a folio and calls page_mapcount() on
>>>>> the first page of that folio.
>>>>>
>>>>> This is necessary for folio conversions where we only care about either the
>>>>> entire_mapcount of a large folio, or the mapcount of a not large folio.
>>>>>
>>>>> This is in contrast to folio_mapcount() which calculates the total
>>>>> number of the times a folio and its subpages are mapped.
>>>>>
>>>>> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
>>>>> ---
>>>>> include/linux/mm.h | 5 +++++
>>>>> 1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>> index c9db257f09b3..543c360f7ecc 100644
>>>>> --- a/include/linux/mm.h
>>>>> +++ b/include/linux/mm.h
>>>>> @@ -875,6 +875,11 @@ static inline int page_mapcount(struct page *page)
>>>>> return mapcount;
>>>>> }
>>>>>
>>>>> +static inline int folio_estimated_mapcount(struct folio *folio)
>>>>> +{
>>>>> + return page_mapcount(folio_page(folio, 0));
>>>>> +}
>>>>> +
>>>>> int folio_total_mapcount(struct folio *folio);
>>>>>
>>>>> /**
>>>>
>>>> I'm sorry, but "estimated" as absolutely unclear semantics. You could
>>>> have a THP mapped into 9999 processes using THP and the estimation would
>>>> be "0".
>>>
>>> ... or would it be 9999 ? What about a PMD-mapped THP? What about a
>>> partially unmapped THP?
>>>
>>> What are we estimating?
>>
>> Thinking about mapcounts again, might not have been my smartest moment.
>>
>> What we return here is the precise number of times the first subpage is
>> mapped (via the large folio and directly). That's supposed to be an
>> estimate for the number of times any subpage part of the folio is mapped.
>>
>> I really don't know a better name, but folio_estimated_mapcount() does
>> not feel completely right to me and triggere dmy confusion in the first
>> place ... hm ...
>
> I can understand the confusion, but I can't think of a better name
> either myself. I'll go ahead and add a comment to make the purpose
> of this function more clear. Looks like I'll have to move it to get rid
> of the build warnings/errors anyway.
The issue is that we're not estimating the mapcount of the folio, so the
name is very misleading ... I think you really want to avoid the term
mapcount completely in that context. We're just using the #mappings of
the first subpage to determine something else.
Thinking about it, I guess "folio_estimated_sharers()" might be what we
actually want to name it. Then you can comment on how we estimate sharers
by looking at how many page tables the first subpage is currently
mapped, and assume the same holds true for the other subpages.
It's unreliable because other subpages might behave differently, we
might not be holding the pagelock to stabilize, and we're not looking at
indirect mappings via the swapcache. But it's a comparatively good
estimate for most scenarios I guess.
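A rough sketch of what that could look like (the name is just the
suggestion above, not an actual patch):
	static inline int folio_estimated_sharers(struct folio *folio)
	{
		/*
		 * Estimate the number of sharers from how many page tables
		 * the first subpage is mapped into, and assume the same
		 * holds for the rest of the folio. Unreliable: other
		 * subpages may behave differently, we may not hold the
		 * folio lock, and indirect mappings via the swapcache are
		 * not counted.
		 */
		return page_mapcount(folio_page(folio, 0));
	}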
--
Thanks,
David / dhildenb
On 1/26/2023 12:37 AM, David Hildenbrand wrote:
> On 25.01.23 23:09, Vishal Moola wrote:
[..]
>
> The issue is that we're not estimating the mapcount of the folio, so the
> name is very misleading ... I think you really want to avoid the term
> mapcount completely in that context. We're just using the #mappings of
> the first subpage to determine something differently.
>
> Thinking about it, I guess "folio_estimated_sharers()" might be what we
> actually want to name it. Then you can comment how we estimate sharers
> by looking at into how many page tables the first subpage is currently
> mapped, and assume the same holds true for the other subpages.
>
> It's unreliable because other subpages might behave differently, we
> might not be holding the pagelock to stabilize, and we're not looking at
> indirect mappings via the swapcache. But it's a comapratively good
> estimate for most scenarios I guess.
>
Hmm, how about simply calling it folio_hpage_mapcount(),
folio_firstpage_mapcount(), or folio_cover_mapcount()?
It is used to replace page_mapcount() in that sense -
https://lore.kernel.org/linux-mm/Y9MDJuPWsk9820xD@x1n/T/#me0531cfb9e82ad5ca88804c727d69cc6b9b33ffa
if (flags & (MPOL_MF_MOVE_ALL) ||
(flags & MPOL_MF_MOVE && folio_estimated_mapcount(folio) == 1 &&
!hugetlb_pmd_shared(pte))) {
if (isolate_hugetlb(folio, qp->pagelist) &&
thanks,
-jane
On 1/26/2023 4:37 PM, David Hildenbrand wrote:
> Thinking about it, I guess "folio_estimated_sharers()" might be what we actually want to name it. Then you can comment how we estimate sharers by looking at into how many page tables the first subpage is currently mapped, and assume the same holds true for the other subpages.
Vote for 'folio_estimated_sharers()'. If a better method
than checking the mapcount is found in the future, it's
easy to update the implementation without changing the API
name.
Regards
Yin, Fengwei
On 28.01.23 01:48, Jane Chu wrote:
> On 1/26/2023 12:37 AM, David Hildenbrand wrote:
>> On 25.01.23 23:09, Vishal Moola wrote:
> [..]
>>
>> The issue is that we're not estimating the mapcount of the folio, so the
>> name is very misleading ... I think you really want to avoid the term
>> mapcount completely in that context. We're just using the #mappings of
>> the first subpage to determine something differently.
>>
>> Thinking about it, I guess "folio_estimated_sharers()" might be what we
>> actually want to name it. Then you can comment how we estimate sharers
>> by looking at into how many page tables the first subpage is currently
>> mapped, and assume the same holds true for the other subpages.
>>
>> It's unreliable because other subpages might behave differently, we
>> might not be holding the pagelock to stabilize, and we're not looking at
>> indirect mappings via the swapcache. But it's a comapratively good
>> estimate for most scenarios I guess.
>>
>
> Hmm, how about simply call it folio_hpage_mapcount(),
> folio_firstpage_mapcount(), or, folio_cover_mapcount() ?
All not better IMHO.
folio_estimated_subpage_mapcount() is a bit too verbose for my taste and ...
>
> It is used to replace page_mapcount() in that sense -
> https://lore.kernel.org/linux-mm/Y9MDJuPWsk9820xD@x1n/T/#me0531cfb9e82ad5ca88804c727d69cc6b9b33ffa
>
> if (flags & (MPOL_MF_MOVE_ALL) ||
> (flags & MPOL_MF_MOVE && folio_estimated_mapcount(folio) == 1 &&
> !hugetlb_pmd_shared(pte))) {
> if (isolate_hugetlb(folio, qp->pagelist) &&
... what we want to have here is an estimate of the number of sharers.
[actually, we would want it precise, but that's hard to achieve ... ]
--
Thanks,
David / dhildenb