2023-01-23 20:25:10

by Sidhartha Kumar

[permalink] [raw]
Subject: [PATCH 1/2] mm/memory_hotplug: remove head page reference in do_migrate_range

The head page variable is not needed as we can use folio equivalent
functions.

Signed-off-by: Sidhartha Kumar <[email protected]>
---
mm/memory_hotplug.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a1e8c3e9ab08..ad09189786b1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1624,7 +1624,7 @@ static int
do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
{
unsigned long pfn;
- struct page *page, *head;
+ struct page *page;
int ret = 0;
LIST_HEAD(source);
static DEFINE_RATELIMIT_STATE(migrate_rs, DEFAULT_RATELIMIT_INTERVAL,
@@ -1637,14 +1637,13 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
continue;
page = pfn_to_page(pfn);
folio = page_folio(page);
- head = &folio->page;

- if (PageHuge(page)) {
- pfn = page_to_pfn(head) + compound_nr(head) - 1;
+ if (folio_test_hugetlb(folio)) {
+ pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
isolate_hugetlb(folio, &source);
continue;
- } else if (PageTransHuge(page))
- pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
+ } else if (folio_test_transhuge(folio))
+ pfn = folio_pfn(folio) + thp_nr_pages(page) - 1;

/*
* HWPoison pages have elevated reference counts so the migration would
--
2.39.1



2023-01-23 20:25:12

by Sidhartha Kumar

[permalink] [raw]
Subject: [PATCH 2/2] mm/memory_hotplug: remove head page reference in scan_movable_pages()

Remove one user of the Huge Page macros which take in a page. Also remove
a reference to a head page variable by using a folio instead.

Signed-off-by: Sidhartha Kumar <[email protected]>
---
mm/memory_hotplug.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index ad09189786b1..d5a0672cdf3a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1579,7 +1579,8 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
unsigned long pfn;

for (pfn = start; pfn < end; pfn++) {
- struct page *page, *head;
+ struct page *page;
+ struct folio *folio;
unsigned long skip;

if (!pfn_valid(pfn))
@@ -1599,9 +1600,9 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
if (PageOffline(page) && page_count(page))
return -EBUSY;

- if (!PageHuge(page))
+ folio = page_folio(page);
+ if (!folio_test_hugetlb(folio))
continue;
- head = compound_head(page);
/*
* This test is racy as we hold no reference or lock. The
* hugetlb page could have been free'ed and head is no longer
@@ -1609,9 +1610,9 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
* cases false positives and negatives are possible. Calling
* code must deal with these scenarios.
*/
- if (HPageMigratable(head))
+ if (folio_test_hugetlb_migratable(folio))
goto found;
- skip = compound_nr(head) - (page - head);
+ skip = folio_nr_pages(folio) - (page - &folio->page);
pfn += skip - 1;
}
return -ENOENT;
--
2.39.1


2023-01-23 20:39:38

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/memory_hotplug: remove head page reference in do_migrate_range

On Mon, Jan 23, 2023 at 12:23:46PM -0800, Sidhartha Kumar wrote:
> @@ -1637,14 +1637,13 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
> continue;
> page = pfn_to_page(pfn);
> folio = page_folio(page);
> - head = &folio->page;
>
> - if (PageHuge(page)) {
> - pfn = page_to_pfn(head) + compound_nr(head) - 1;
> + if (folio_test_hugetlb(folio)) {
> + pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
> isolate_hugetlb(folio, &source);
> continue;
> - } else if (PageTransHuge(page))
> - pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
> + } else if (folio_test_transhuge(folio))
> + pfn = folio_pfn(folio) + thp_nr_pages(page) - 1;

I'm pretty sure those two lines should be...

} else if (folio_test_large(folio))
pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;

But, erm ... we're doing this before we have a refcount on the page,
right? So this is unsafe because the page might change which folio
it is in. And the folio we found earlier might become a tail page
of a different folio. (As the comment below explains, HWPoison pages
won't, so it's not unsafe for them).

Also, thp_nr_pages(page) is going to return 1 for tail pages. So this
is a noop, unless page is a head page.

It's all a bit confusing, and being memory-hotplug, it's not well
tested. More thought needed.

2023-01-23 20:41:46

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm/memory_hotplug: remove head page reference in scan_movable_pages()

On Mon, Jan 23, 2023 at 12:23:47PM -0800, Sidhartha Kumar wrote:
> - if (!PageHuge(page))
> + folio = page_folio(page);
> + if (!folio_test_hugetlb(folio))
> continue;
> - head = compound_head(page);
> /*
> * This test is racy as we hold no reference or lock. The
> * hugetlb page could have been free'ed and head is no longer

Assuming the comment is correct, this patch is also unsafe.


2023-01-23 21:09:19

by Sidhartha Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/memory_hotplug: remove head page reference in do_migrate_range

On 1/23/23 12:37 PM, Matthew Wilcox wrote:
> On Mon, Jan 23, 2023 at 12:23:46PM -0800, Sidhartha Kumar wrote:
>> @@ -1637,14 +1637,13 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>> continue;
>> page = pfn_to_page(pfn);
>> folio = page_folio(page);
>> - head = &folio->page;
>>
>> - if (PageHuge(page)) {
>> - pfn = page_to_pfn(head) + compound_nr(head) - 1;
>> + if (folio_test_hugetlb(folio)) {
>> + pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>> isolate_hugetlb(folio, &source);
>> continue;
>> - } else if (PageTransHuge(page))
>> - pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>> + } else if (folio_test_transhuge(folio))
>> + pfn = folio_pfn(folio) + thp_nr_pages(page) - 1;
>
> I'm pretty sure those two lines should be...
>
> } else if (folio_test_large(folio) > pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>
> But, erm ... we're doing this before we have a refcount on the page,
> right? So this is unsafe because the page might change which folio
> it is in. And the folio we found earlier might become a tail page
> of a different folio. (As the comment below explains, HWPoison pages
> won't, so it's not unsafe for them).
>

Thanks for the explanation of why this is unsafe. Would it be worth to
put this code block inside the

if (!get_page_unless_zero(page))
continue;

put_page(page);

block found lower? My motivation for this series is the HPageMigratable
call in patch 2 is the last user of the huge page flag test macros so a
conversion would allow for the removal of the macro. I thought I could
also remove the head page references found in this function, but if that
would cause too much churn in a complicated sub-system it can be dropped.

Thanks,
Sidhartha Kumar
> Also, thp_nr_pages(page) is going to return 1 for tail pages. So this
> is a noop, unless page is a head page.
>
> It's all a bit confusing, and being memory-hotplug, it's not well
> tested. More thought needed.


2023-01-24 02:32:20

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/memory_hotplug: remove head page reference in do_migrate_range

On Mon, Jan 23, 2023 at 01:08:49PM -0800, Sidhartha Kumar wrote:
> On 1/23/23 12:37 PM, Matthew Wilcox wrote:
> > On Mon, Jan 23, 2023 at 12:23:46PM -0800, Sidhartha Kumar wrote:
> > > @@ -1637,14 +1637,13 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
> > > continue;
> > > page = pfn_to_page(pfn);
> > > folio = page_folio(page);
> > > - head = &folio->page;
> > > - if (PageHuge(page)) {
> > > - pfn = page_to_pfn(head) + compound_nr(head) - 1;
> > > + if (folio_test_hugetlb(folio)) {
> > > + pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
> > > isolate_hugetlb(folio, &source);
> > > continue;
> > > - } else if (PageTransHuge(page))
> > > - pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
> > > + } else if (folio_test_transhuge(folio))
> > > + pfn = folio_pfn(folio) + thp_nr_pages(page) - 1;
> >
> > I'm pretty sure those two lines should be...
> >
> > } else if (folio_test_large(folio) > pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
> >
> > But, erm ... we're doing this before we have a refcount on the page,
> > right? So this is unsafe because the page might change which folio
> > it is in. And the folio we found earlier might become a tail page
> > of a different folio. (As the comment below explains, HWPoison pages
> > won't, so it's not unsafe for them).
> >
>
> Thanks for the explanation of why this is unsafe. Would it be worth to put
> this code block inside the
>
> if (!get_page_unless_zero(page))
> continue;
>
> put_page(page);
>
> block found lower? My motivation for this series is the HPageMigratable call
> in patch 2 is the last user of the huge page flag test macros so a
> conversion would allow for the removal of the macro. I thought I could also
> remove the head page references found in this function, but if that would
> cause too much churn in a complicated sub-system it can be dropped.

I think we just have to be very careful when working without a page ref.

Now, specifically to the matter of converting HPageMigratable(), I think
that's fine. Your folio_test_hugetlb_##flname macro does not have a
VM_BUG_ON_PGFLAGS(PageTail(page), page) in it, unlike folio_flags().
So it looks like even if your folio becomes a tail page in the middle
of scan_movable_pages(), you won't hit a BUG().

Now, should we go further? Possibly. But I'm more concerned that we
haven't really figured out which functions should be checking this.
Maybe we should drop the BUG entirely and rely more on the type system
(and people not casting) to prevent errors.

We could go a lot further with the type system and define a new type for
"might be a folio but we don't have a refcount on it". But we don't
do a lot of work with unreferenced folios, so I'm not inclined to go to
all that effort.

Perhaps we want a folio_maybe_X() series of functions that don't warn
if the folio has morphed into not-a-folio. I don't know.

2023-01-24 10:18:44

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/memory_hotplug: remove head page reference in do_migrate_range

On 23.01.23 21:37, Matthew Wilcox wrote:
> On Mon, Jan 23, 2023 at 12:23:46PM -0800, Sidhartha Kumar wrote:
>> @@ -1637,14 +1637,13 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>> continue;
>> page = pfn_to_page(pfn);
>> folio = page_folio(page);
>> - head = &folio->page;
>>
>> - if (PageHuge(page)) {
>> - pfn = page_to_pfn(head) + compound_nr(head) - 1;
>> + if (folio_test_hugetlb(folio)) {
>> + pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>> isolate_hugetlb(folio, &source);
>> continue;
>> - } else if (PageTransHuge(page))
>> - pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>> + } else if (folio_test_transhuge(folio))
>> + pfn = folio_pfn(folio) + thp_nr_pages(page) - 1;
>
> I'm pretty sure those two lines should be...
>
> } else if (folio_test_large(folio))
> pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>
> But, erm ... we're doing this before we have a refcount on the page,
> right? So this is unsafe because the page might change which folio
> it is in. And the folio we found earlier might become a tail page
> of a different folio. (As the comment below explains, HWPoison pages
> won't, so it's not unsafe for them).
>
> Also, thp_nr_pages(page) is going to return 1 for tail pages. So this
> is a noop, unless page is a head page.
>
> It's all a bit confusing, and being memory-hotplug, it's not well
> tested. More thought needed.

Ehm, it is fairly well tested ;)

As memory offlining keeps retrying, temporarily making wrong assumptions
about a folio is acceptable, as long as we don't run into BUGs.

It's certainly worth a big comment in a code, that this is all racy and
that page migration code will stabilize.

Now, we could temporarily take a reference, but ... common migration
code will try taking its own ref to stabilize the page and would be
confused about yet another ref (-> migration will fail).

So we have to be careful about grabbing references on these pages, and
how long we're going to hold them. Otherwise we'll break memory
offlining completely :)

--
Thanks,

David / dhildenb