2022-08-29 23:30:10

by Sidhartha Kumar

Subject: [PATCH 0/7] begin converting hugetlb code to folios

This patch series starts the conversion of the hugetlb code to operate
on struct folios rather than struct pages. This removes the ambiguity
of whether functions are operating on head pages, tail pages of compound
pages, or base pages.

This series passes the Linux Test Project (LTP) hugetlb test cases.

Patch 1 adds hugetlb-specific page macros that can operate on folios.

Patch 2 adds the private field of the first tail page to struct page
and struct folio. This patch depends on Matthew Wilcox's patch
"mm: Add the first tail page to struct folio" [1].

Patches 3-4 introduce hugetlb subpool helper functions which operate on
struct folios. These patches were tested using the hugepage-mmap.c
selftest along with the migratepages command.

Patch 5 converts hugetlb_delete_from_page_cache() to use folios.
This patch depends on Mike Kravetz's patch "hugetlb: rename
remove_huge_page to hugetlb_delete_from_page_cache" [2].


Patch 6 adds a folio_hstate() function to get hstate information from a
folio.

Patch 7 adds a user of folio_hstate().

bpftrace was used to track the time spent in free_huge_page() during
the LTP test cases, as it is a caller of the hugetlb subpool
functions. The histograms below show similar performance before and
after the patch series.

Time spent in 'free_huge_page'

6.0.0-rc2.master.20220823
@nsecs:
[256, 512)         14770 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K)            155 |                                                    |
[1K, 2K)             169 |                                                    |
[2K, 4K)              50 |                                                    |
[4K, 8K)              14 |                                                    |
[8K, 16K)              3 |                                                    |
[16K, 32K)             3 |                                                    |


6.0.0-rc2.master.20220823 + patch series
@nsecs:
[256, 512)         13678 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K)            142 |                                                    |
[1K, 2K)             199 |                                                    |
[2K, 4K)              44 |                                                    |
[4K, 8K)              13 |                                                    |
[8K, 16K)              4 |                                                    |
[16K, 32K)             1 |                                                    |

[1] https://lore.kernel.org/linux-mm/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/T/#me431952361ea576862d7eb617a5dced9807dbabb

Sidhartha Kumar (7):
mm/hugetlb: add folio support to hugetlb specific flag macros
mm: add private field of first tail to struct page and struct folio
mm/hugetlb: add hugetlb_folio_subpool() helper
mm/hugetlb: add hugetlb_set_folio_subpool() helper
mm/hugetlb: convert hugetlb_delete_from_page_cache() to use folios
mm/hugetlb: add folio_hstate()
mm/migrate: use folio_hstate() in alloc_migration_target()

fs/hugetlbfs/inode.c | 22 ++++++++++----------
include/linux/hugetlb.h | 45 ++++++++++++++++++++++++++++++++++++----
include/linux/mm_types.h | 15 ++++++++++++++
mm/migrate.c | 2 +-
4 files changed, 68 insertions(+), 16 deletions(-)

--
2.31.1


2022-08-29 23:31:17

by Sidhartha Kumar

Subject: [PATCH 7/7] mm/migrate: use folio_hstate() in alloc_migration_target()

Allows alloc_migration_target() to get hstate information directly from a folio.

Signed-off-by: Sidhartha Kumar <[email protected]>
---
mm/migrate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 6a1597c92261..55392a706493 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1589,7 +1589,7 @@ struct page *alloc_migration_target(struct page *page, unsigned long private)
nid = folio_nid(folio);

if (folio_test_hugetlb(folio)) {
- struct hstate *h = page_hstate(&folio->page);
+ struct hstate *h = folio_hstate(folio);

gfp_mask = htlb_modify_alloc_mask(h, gfp_mask);
return alloc_huge_page_nodemask(h, nid, mtc->nmask, gfp_mask);
--
2.31.1
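
For readers following along, here is a rough user-space sketch of what a
folio_hstate()-style lookup does. It assumes the helper resolves the hstate
from the folio's size, the same way page_hstate() goes through
size_to_hstate(). Every toy_* name, the two-entry table, and the 4096-byte
base page size are made up for illustration and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL	/* assumed base page size */

/* Hypothetical stand-in for struct hstate: one entry per huge page size. */
struct toy_hstate {
	unsigned int order;	/* huge page size is TOY_PAGE_SIZE << order */
	const char *name;
};

static struct toy_hstate toy_hstates[] = {
	{ 9,  "hugepages-2048kB" },
	{ 18, "hugepages-1048576kB" },
};

/* Hypothetical stand-in for struct folio: only the order matters here. */
struct toy_folio {
	unsigned int order;
};

static inline unsigned long toy_folio_size(const struct toy_folio *folio)
{
	return TOY_PAGE_SIZE << folio->order;
}

/* Return the hstate whose huge page size equals @size, or NULL if none. */
static struct toy_hstate *toy_size_to_hstate(unsigned long size)
{
	size_t i;

	for (i = 0; i < sizeof(toy_hstates) / sizeof(toy_hstates[0]); i++) {
		if ((TOY_PAGE_SIZE << toy_hstates[i].order) == size)
			return &toy_hstates[i];
	}
	return NULL;
}

/* The folio is passed straight in; no detour through &folio->page. */
static inline struct toy_hstate *toy_folio_hstate(const struct toy_folio *folio)
{
	return toy_size_to_hstate(toy_folio_size(folio));
}
```

The point of the conversion in the hunk above is only that the caller hands
over the folio it already holds instead of converting back to a page first.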

2022-08-29 23:47:03

by Sidhartha Kumar

Subject: [PATCH 2/7] mm: add private field of first tail to struct page and struct folio

Allows struct folio to store hugetlb metadata that is contained in the
private field of the first tail page.

Signed-off-by: Sidhartha Kumar <[email protected]>
---
include/linux/mm_types.h | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8a9ee9d24973..726c5304172c 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -144,6 +144,7 @@ struct page {
#ifdef CONFIG_64BIT
unsigned int compound_nr; /* 1 << compound_order */
#endif
+ unsigned long _private_1;
};
struct { /* Second tail page of compound page */
unsigned long _compound_pad_1; /* compound_head */
@@ -251,6 +252,7 @@ struct page {
* @_total_mapcount: Do not use directly, call folio_entire_mapcount().
* @_pincount: Do not use directly, call folio_maybe_dma_pinned().
* @_folio_nr_pages: Do not use directly, call folio_nr_pages().
+ * @_private_1: Do not use directly, call folio_get_private_1().
*
* A folio is a physically, virtually and logically contiguous set
* of bytes. It is a power-of-two in size, and it is aligned to that
@@ -298,6 +300,8 @@ struct folio {
#ifdef CONFIG_64BIT
unsigned int _folio_nr_pages;
#endif
+ unsigned long _private_1;
+
};

#define FOLIO_MATCH(pg, fl) \
@@ -325,6 +329,7 @@ FOLIO_MATCH(compound_mapcount, _total_mapcount);
FOLIO_MATCH(compound_pincount, _pincount);
#ifdef CONFIG_64BIT
FOLIO_MATCH(compound_nr, _folio_nr_pages);
+FOLIO_MATCH(_private_1, _private_1);
#endif
#undef FOLIO_MATCH

@@ -370,6 +375,16 @@ static inline void *folio_get_private(struct folio *folio)
return folio->private;
}

+static inline void folio_set_private_1(struct folio *folio, unsigned long private)
+{
+ folio->_private_1 = private;
+}
+
+static inline unsigned long folio_get_private_1(struct folio *folio)
+{
+ return folio->_private_1;
+}
+
struct page_frag_cache {
void * va;
#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
--
2.31.1
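
A small user-space model may help show what the FOLIO_MATCH() line in this
patch is checking: that a folio field lands on exactly the same bytes as the
corresponding field of the first tail page. The layouts below are heavily
simplified and the toy_* names are invented for this sketch; only the
offsetof() technique is meant to carry over.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct page; NOT the real kernel layout. */
struct toy_page {
	unsigned long flags;
	unsigned long compound_head;
	unsigned long pad[3];
	unsigned long private;
};

/* Simplified stand-in for struct folio: head page, then first tail page. */
struct toy_folio {
	struct toy_page page;		/* overlays pages[0] */
	unsigned long _flags_1;		/* overlays pages[1].flags */
	unsigned long _head_1;		/* overlays pages[1].compound_head */
	unsigned long _pad_1[3];	/* overlays pages[1].pad */
	unsigned long _private_1;	/* overlays pages[1].private */
};

/* Compile-time check that a folio field overlays a first-tail-page field. */
#define TOY_FOLIO_MATCH(pg, fl)						\
	static_assert(offsetof(struct toy_folio, fl) ==			\
		      sizeof(struct toy_page) + offsetof(struct toy_page, pg), \
		      "folio field must overlay first tail page field")
TOY_FOLIO_MATCH(flags, _flags_1);
TOY_FOLIO_MATCH(private, _private_1);
#undef TOY_FOLIO_MATCH

/* A union lets both views of the same memory be used without aliasing issues. */
union toy_compound {
	struct toy_page pages[2];
	struct toy_folio folio;
};

static inline void toy_folio_set_private_1(struct toy_folio *folio,
					   unsigned long private)
{
	folio->_private_1 = private;
}

static inline unsigned long toy_folio_get_private_1(struct toy_folio *folio)
{
	return folio->_private_1;
}
```

If a field is added to one view but not the other, or is guarded by a
different #ifdef, the static_assert fails at build time rather than silently
reading the wrong word at run time.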

2022-08-30 00:19:44

by Sidhartha Kumar

Subject: [PATCH 3/7] mm/hugetlb: add hugetlb_folio_subpool() helper

Allows hugetlbfs_migrate_folio to check subpool information by passing in a
folio.

Signed-off-by: Sidhartha Kumar <[email protected]>
---
fs/hugetlbfs/inode.c | 4 ++--
include/linux/hugetlb.h | 7 ++++++-
2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 9326693c4987..d1a6384f426e 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -970,9 +970,9 @@ static int hugetlbfs_migrate_folio(struct address_space *mapping,
if (rc != MIGRATEPAGE_SUCCESS)
return rc;

- if (hugetlb_page_subpool(&src->page)) {
+ if (hugetlb_folio_subpool(src)) {
hugetlb_set_page_subpool(&dst->page,
- hugetlb_page_subpool(&src->page));
+ hugetlb_folio_subpool(src));
hugetlb_set_page_subpool(&src->page, NULL);
}

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index c0a9bc9a6fa5..f6d5467c5ed8 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -709,12 +709,17 @@ extern unsigned int default_hstate_idx;

#define default_hstate (hstates[default_hstate_idx])

+static inline struct hugepage_subpool *hugetlb_folio_subpool(struct folio *folio)
+{
+ return (void *)folio_get_private_1(folio);
+}
+
/*
* hugetlb page subpool pointer located in hpage[1].private
*/
static inline struct hugepage_subpool *hugetlb_page_subpool(struct page *hpage)
{
- return (void *)page_private(hpage + SUBPAGE_INDEX_SUBPOOL);
+ return hugetlb_folio_subpool(page_folio(hpage));
}

static inline void hugetlb_set_page_subpool(struct page *hpage,
--
2.31.1
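
As a sketch of the pattern this patch follows (new folio helper, with the
old page helper reduced to a wrapper around it), here is a stand-alone
user-space model. The toy_* types and helpers are hypothetical; the setter
mirrors what the title of patch 4 suggests rather than its actual code, and
uintptr_t is used in place of unsigned long so that the pointer casts are
portable outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct hugepage_subpool. */
struct toy_subpool {
	long max_hpages;
};

/* Hypothetical, minimal page and folio; only the fields used here. */
struct toy_page {
	uintptr_t private;
};

struct toy_folio {
	struct toy_page page;	/* head page */
	uintptr_t _private_1;	/* models page[1].private */
};

/* Valid only when @head really is the first member of a toy_folio. */
static inline struct toy_folio *toy_page_folio(struct toy_page *head)
{
	return (struct toy_folio *)head;
}

static inline struct toy_subpool *toy_folio_subpool(struct toy_folio *folio)
{
	return (struct toy_subpool *)folio->_private_1;
}

static inline void toy_set_folio_subpool(struct toy_folio *folio,
					 struct toy_subpool *spool)
{
	folio->_private_1 = (uintptr_t)spool;
}

/* Page-based wrapper kept for callers that are not converted yet. */
static inline struct toy_subpool *toy_page_subpool(struct toy_page *hpage)
{
	return toy_folio_subpool(toy_page_folio(hpage));
}

/* Hand the subpool from src to dst, modelled on the migrate hunk above. */
static inline void toy_move_subpool(struct toy_folio *dst,
				    struct toy_folio *src)
{
	if (toy_folio_subpool(src)) {
		toy_set_folio_subpool(dst, toy_folio_subpool(src));
		toy_set_folio_subpool(src, NULL);
	}
}
```

Keeping the page-based wrapper means existing callers keep working while the
conversion proceeds one call site at a time.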

2022-08-30 04:21:51

by Matthew Wilcox

Subject: Re: [PATCH 2/7] mm: add private field of first tail to struct page and struct folio

On Mon, Aug 29, 2022 at 04:00:09PM -0700, Sidhartha Kumar wrote:
> +++ b/include/linux/mm_types.h
> @@ -144,6 +144,7 @@ struct page {
> #ifdef CONFIG_64BIT
> unsigned int compound_nr; /* 1 << compound_order */
> #endif
> + unsigned long _private_1;
> };
> struct { /* Second tail page of compound page */
> unsigned long _compound_pad_1; /* compound_head */

Have you tested compiling this on 32-bit? I think you need to move
the _private_1 inside the ifdef CONFIG_64BIT.

> @@ -251,6 +252,7 @@ struct page {
> * @_total_mapcount: Do not use directly, call folio_entire_mapcount().
> * @_pincount: Do not use directly, call folio_maybe_dma_pinned().
> * @_folio_nr_pages: Do not use directly, call folio_nr_pages().
> + * @_private_1: Do not use directly, call folio_get_private_1().
> *
> * A folio is a physically, virtually and logically contiguous set
> * of bytes. It is a power-of-two in size, and it is aligned to that
> @@ -298,6 +300,8 @@ struct folio {
> #ifdef CONFIG_64BIT
> unsigned int _folio_nr_pages;
> #endif
> + unsigned long _private_1;

(but don't do that here!)

The intent is that _private_1 lines up with head[1].private on 32-bit.
It's a bit tricky, and I'm not sure that I'm thinking about it quite right.

> };
>
> #define FOLIO_MATCH(pg, fl) \
> @@ -325,6 +329,7 @@ FOLIO_MATCH(compound_mapcount, _total_mapcount);
> FOLIO_MATCH(compound_pincount, _pincount);
> #ifdef CONFIG_64BIT
> FOLIO_MATCH(compound_nr, _folio_nr_pages);
> +FOLIO_MATCH(_private_1, _private_1);
> #endif
> #undef FOLIO_MATCH
>
> @@ -370,6 +375,16 @@ static inline void *folio_get_private(struct folio *folio)
> return folio->private;
> }
>
> +static inline void folio_set_private_1(struct folio *folio, unsigned long private)
> +{
> + folio->_private_1 = private;
> +}
> +
> +static inline unsigned long folio_get_private_1(struct folio *folio)
> +{
> + return folio->_private_1;
> +}
> +
> struct page_frag_cache {
> void * va;
> #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
> --
> 2.31.1
>

2022-09-01 18:06:02

by Mike Kravetz

Subject: Re: [PATCH 2/7] mm: add private field of first tail to struct page and struct folio

On 08/29/22 16:00, Sidhartha Kumar wrote:
> Allows struct folio to store hugetlb metadata that is contained in the
> private field of the first tail page.
>
> Signed-off-by: Sidhartha Kumar <[email protected]>
> ---
> include/linux/mm_types.h | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 8a9ee9d24973..726c5304172c 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -144,6 +144,7 @@ struct page {
> #ifdef CONFIG_64BIT
> unsigned int compound_nr; /* 1 << compound_order */
> #endif
> + unsigned long _private_1;
> };
> struct { /* Second tail page of compound page */
> unsigned long _compound_pad_1; /* compound_head */
> @@ -251,6 +252,7 @@ struct page {
> * @_total_mapcount: Do not use directly, call folio_entire_mapcount().
> * @_pincount: Do not use directly, call folio_maybe_dma_pinned().
> * @_folio_nr_pages: Do not use directly, call folio_nr_pages().
> + * @_private_1: Do not use directly, call folio_get_private_1().
> *
> * A folio is a physically, virtually and logically contiguous set
> * of bytes. It is a power-of-two in size, and it is aligned to that

Not really an issue with this patch, but it made me read more of this
comment about folios. It goes on to say ...

* same power-of-two. It is at least as large as %PAGE_SIZE. If it is
* in the page cache, it is at a file offset which is a multiple of that
* power-of-two. It may be mapped into userspace at an address which is
* at an arbitrary page offset, but its kernel virtual address is aligned
* to its size.
*/

This series is to begin converting hugetlb code to folios. Just want to
note that 'hugetlb folios' have specific user space alignment restrictions.
So, I do not think the comment about arbitrary page offset would apply to
hugetlb.

Matthew, should we note that hugetlb is special in the comment? Or, is it
not worth updating?

Also, folio_get_private_1 will be used for the hugetlb subpool pointer
which resides in page[1].private. This is used in the next patch of
this series. I'm sure you are aware that hugetlb also uses page private
in sub pages 2 and 3. Can/will/should this method of accessing private
in sub pages be expanded to cover these as well? Expansion can happen
later, but if this can not be expanded perhaps we should come up with
another scheme.
--
Mike Kravetz



> @@ -298,6 +300,8 @@ struct folio {
> #ifdef CONFIG_64BIT
> unsigned int _folio_nr_pages;
> #endif
> + unsigned long _private_1;
> +
> };
>
> #define FOLIO_MATCH(pg, fl) \
> @@ -325,6 +329,7 @@ FOLIO_MATCH(compound_mapcount, _total_mapcount);
> FOLIO_MATCH(compound_pincount, _pincount);
> #ifdef CONFIG_64BIT
> FOLIO_MATCH(compound_nr, _folio_nr_pages);
> +FOLIO_MATCH(_private_1, _private_1);
> #endif
> #undef FOLIO_MATCH
>
> @@ -370,6 +375,16 @@ static inline void *folio_get_private(struct folio *folio)
> return folio->private;
> }
>
> +static inline void folio_set_private_1(struct folio *folio, unsigned long private)
> +{
> + folio->_private_1 = private;
> +}
> +
> +static inline unsigned long folio_get_private_1(struct folio *folio)
> +{
> + return folio->_private_1;
> +}
> +
> struct page_frag_cache {
> void * va;
> #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
> --
> 2.31.1
>

2022-09-01 18:45:53

by Matthew Wilcox

Subject: Re: [PATCH 2/7] mm: add private field of first tail to struct page and struct folio

On Thu, Sep 01, 2022 at 10:32:43AM -0700, Mike Kravetz wrote:
> Not really an issue with this patch, but it made me read more of this
> comment about folios. It goes on to say ...
>
> * same power-of-two. It is at least as large as %PAGE_SIZE. If it is
> * in the page cache, it is at a file offset which is a multiple of that
> * power-of-two. It may be mapped into userspace at an address which is
> * at an arbitrary page offset, but its kernel virtual address is aligned
> * to its size.
> */
>
> This series is to begin converting hugetlb code to folios. Just want to
> note that 'hugetlb folios' have specific user space alignment restrictions.
> So, I do not think the comment about arbitrary page offset would apply to
> hugetlb.
>
> Matthew, should we note that hugetlb is special in the comment? Or, is it
> not worth updating?

I'm open to updating it if we can find good wording. What I'm trying
to get across there is that when dealing with folios, you can assume
that they're naturally aligned physically, logically (in the file) and
virtually (kernel address), but not necessarily virtually (user
address). Hugetlb folios are special in that they are guaranteed to
be virtually aligned in user space, but I don't know if here is the
right place to document that. It's an additional restriction, so code
which handles generic folios doesn't need to know it.
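
To make "naturally aligned" concrete, a minimal user-space check might look
like the following. The helper names and the example addresses are invented
for illustration; the 2MB size is just one possible folio size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if @x is a non-zero power of two. */
static inline bool toy_is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* True if @addr is a multiple of @size; @size must be a power of two. */
static inline bool toy_naturally_aligned(uint64_t addr, uint64_t size)
{
	assert(toy_is_pow2(size));
	return (addr & (size - 1)) == 0;
}
```

For a 2MB folio, toy_naturally_aligned(0x200000, 0x200000) holds for a
physical or kernel virtual address, while a user mapping at 0x7f0000201000
fails the check. That is allowed for a generic folio, but would not happen
for a hugetlb folio.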

> Also, folio_get_private_1 will be used for the hugetlb subpool pointer
> which resides in page[1].private. This is used in the next patch of
> this series. I'm sure you are aware that hugetlb also uses page private
> in sub pages 2 and 3. Can/will/should this method of accessing private
> in sub pages be expanded to cover these as well? Expansion can happen
> later, but if this can not be expanded perhaps we should come up with
> another scheme.

There's a few ways of tackling this. What I'm currently thinking is
that we change how hugetlbfs uses struct page to store its extra data.
It would end up looking something like this (in struct page):

+++ b/include/linux/mm_types.h
@@ -147,9 +147,10 @@ struct page {
};
struct { /* Second tail page of compound page */
unsigned long _compound_pad_1; /* compound_head */
- unsigned long _compound_pad_2;
/* For both global and memcg */
struct list_head deferred_list;
+ unsigned long hugetlbfs_private_2;
+ unsigned long hugetlbfs_private_3;
};
struct { /* Page table pages */
unsigned long _pt_pad_1; /* compound_head */

although we could use better names and/or types? I haven't looked to
see what you're storing here yet. And then we can make the
corresponding change to struct folio to add these elements at the
right place.

Does that sound sensible?

2022-09-01 20:57:46

by Mike Kravetz

Subject: Re: [PATCH 2/7] mm: add private field of first tail to struct page and struct folio

On 09/01/22 19:32, Matthew Wilcox wrote:
> On Thu, Sep 01, 2022 at 10:32:43AM -0700, Mike Kravetz wrote:
> > Not really an issue with this patch, but it made me read more of this
> > comment about folios. It goes on to say ...
> >
> > * same power-of-two. It is at least as large as %PAGE_SIZE. If it is
> > * in the page cache, it is at a file offset which is a multiple of that
> > * power-of-two. It may be mapped into userspace at an address which is
> > * at an arbitrary page offset, but its kernel virtual address is aligned
> > * to its size.
> > */
> >
> > This series is to begin converting hugetlb code to folios. Just want to
> > note that 'hugetlb folios' have specific user space alignment restrictions.
> > So, I do not think the comment about arbitrary page offset would apply to
> > hugetlb.
> >
> > Matthew, should we note that hugetlb is special in the comment? Or, is it
> > not worth updating?
>
> I'm open to updating it if we can find good wording. What I'm trying
> to get across there is that when dealing with folios, you can assume
> that they're naturally aligned physically, logically (in the file) and
> virtually (kernel address), but not necessarily virtually (user
> address). Hugetlb folios are special in that they are guaranteed to
> be virtually aligned in user space, but I don't know if here is the
> right place to document that. It's an additional restriction, so code
> which handles generic folios doesn't need to know it.

Fair enough. No need to change. It just caught my eye.

> > Also, folio_get_private_1 will be used for the hugetlb subpool pointer
> > which resides in page[1].private. This is used in the next patch of
> > this series. I'm sure you are aware that hugetlb also uses page private
> > in sub pages 2 and 3. Can/will/should this method of accessing private
> > in sub pages be expanded to cover these as well? Expansion can happen
> > later, but if this can not be expanded perhaps we should come up with
> > another scheme.
>
> There's a few ways of tackling this. What I'm currently thinking is
> that we change how hugetlbfs uses struct page to store its extra data.
> It would end up looking something like this (in struct page):
>
> +++ b/include/linux/mm_types.h
> @@ -147,9 +147,10 @@ struct page {
> };
> struct { /* Second tail page of compound page */
> unsigned long _compound_pad_1; /* compound_head */
> - unsigned long _compound_pad_2;
> /* For both global and memcg */
> struct list_head deferred_list;
> + unsigned long hugetlbfs_private_2;
> + unsigned long hugetlbfs_private_3;
> };
> struct { /* Page table pages */
> unsigned long _pt_pad_1; /* compound_head */
>
> although we could use better names and/or types? I haven't looked to
> see what you're storing here yet. And then we can make the
> corresponding change to struct folio to add these elements at the
> right place.

I am terrible at names. hugetlb is storing pointers in the private fields.
FWICT, something like this would work.

>
> Does that sound sensible?

--
Mike Kravetz