2021-07-10 00:26:03

by Mike Kravetz

[permalink] [raw]
Subject: [PATCH 3/3] hugetlb: before freeing hugetlb page set dtor to appropriate value

When removing a hugetlb page from the pool the ref count is set to
one (as the free page has no ref count) and compound page destructor
is set to NULL_COMPOUND_DTOR. Since a subsequent call to free the
hugetlb page will call __free_pages for non-gigantic pages and
free_gigantic_page for gigantic pages the destructor is not used.

However, consider the following race with code taking a speculative
reference on the page:

Thread 0 Thread 1
-------- --------
remove_hugetlb_page
set_page_refcounted(page);
set_compound_page_dtor(page,
NULL_COMPOUND_DTOR);
get_page_unless_zero(page)
__update_and_free_page
__free_pages(page,
huge_page_order(h));

/* Note that __free_pages() will simply drop
the reference to the page. */

put_page(page)
__put_compound_page()
destroy_compound_page
NULL_COMPOUND_DTOR
BUG: kernel NULL pointer
dereference, address:
0000000000000000

To address this race, set the dtor to the normal compound page dtor
for non-gigantic pages. The dtor for gigantic pages does not matter
as gigantic pages are changed from a compound page to 'just a group of
pages' before freeing. Hence, the destructor is not used.

Signed-off-by: Mike Kravetz <[email protected]>
---
mm/hugetlb.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3132c7395743..fa8ec2072949 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1370,8 +1370,28 @@ static void remove_hugetlb_page(struct hstate *h, struct page *page,
h->surplus_huge_pages_node[nid]--;
}

+ /*
+ * Very subtle
+ *
+ * For non-gigantic pages set the destructor to the normal compound
+ * page dtor. This is needed in case someone takes an additional
+ * temporary ref to the page, and freeing is delayed until they drop
+ * their reference.
+ *
+ * For gigantic pages set the destructor to the null dtor. This
+ * destructor will never be called. Before freeing the gigantic
+ * page destroy_compound_gigantic_page will turn the compound page
+ * into a simple group of pages. After this the destructor does not
+ * apply.
+ *
+ * This handles the case where more than one ref is held when and
+ * after update_and_free_page is called.
+ */
set_page_refcounted(page);
- set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
+ if (hstate_is_gigantic(h))
+ set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
+ else
+ set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);

h->nr_huge_pages--;
h->nr_huge_pages_node[nid]--;
--
2.31.1


2021-07-14 10:59:35

by Muchun Song

[permalink] [raw]
Subject: Re: [External] [PATCH 3/3] hugetlb: before freeing hugetlb page set dtor to appropriate value

On Sat, Jul 10, 2021 at 8:25 AM Mike Kravetz <[email protected]> wrote:
>
> When removing a hugetlb page from the pool the ref count is set to
> one (as the free page has no ref count) and compound page destructor
> is set to NULL_COMPOUND_DTOR. Since a subsequent call to free the
> hugetlb page will call __free_pages for non-gigantic pages and
> free_gigantic_page for gigantic pages the destructor is not used.
>
> However, consider the following race with code taking a speculative
> reference on the page:
>
> Thread 0 Thread 1
> -------- --------
> remove_hugetlb_page
> set_page_refcounted(page);
> set_compound_page_dtor(page,
> NULL_COMPOUND_DTOR);
> get_page_unless_zero(page)
> __update_and_free_page
> __free_pages(page,
> huge_page_order(h));
>
> /* Note that __free_pages() will simply drop
> the reference to the page. */
>
> put_page(page)
> __put_compound_page()
> destroy_compound_page
> NULL_COMPOUND_DTOR
> BUG: kernel NULL pointer
> dereference, address:
> 0000000000000000
>
> To address this race, set the dtor to the normal compound page dtor
> for non-gigantic pages. The dtor for gigantic pages does not matter
> as gigantic pages are changed from a compound page to 'just a group of
> pages' before freeing. Hence, the destructor is not used.
>
> Signed-off-by: Mike Kravetz <[email protected]>
> ---
> mm/hugetlb.c | 22 +++++++++++++++++++++-
> 1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 3132c7395743..fa8ec2072949 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1370,8 +1370,28 @@ static void remove_hugetlb_page(struct hstate *h, struct page *page,
> h->surplus_huge_pages_node[nid]--;
> }
>
> + /*
> + * Very subtle
> + *
> + * For non-gigantic pages set the destructor to the normal compound
> + * page dtor. This is needed in case someone takes an additional
> + * temporary ref to the page, and freeing is delayed until they drop
> + * their reference.
> + *
> + * For gigantic pages set the destructor to the null dtor. This
> + * destructor will never be called. Before freeing the gigantic
> + * page destroy_compound_gigantic_page will turn the compound page
> + * into a simple group of pages. After this the destructor does not
> + * apply.
> + *
> + * This handles the case where more than one ref is held when and
> + * after update_and_free_page is called.
> + */
> set_page_refcounted(page);
> - set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
> + if (hstate_is_gigantic(h))
> + set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
> + else
> + set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);

Hi Mike,

The race is really subtle. But we also should remove the WARN from
free_contig_range, right? Because the refcount of the head page of
the gigantic page can be greater than one, but free_contig_range has
the following warning.

WARN(count != 0, "%lu pages are still in use!\n", count);

Thanks.

>
> h->nr_huge_pages--;
> h->nr_huge_pages_node[nid]--;
> --
> 2.31.1
>

2021-07-14 17:41:16

by Mike Kravetz

[permalink] [raw]
Subject: Re: [External] [PATCH 3/3] hugetlb: before freeing hugetlb page set dtor to appropriate value

On 7/14/21 3:57 AM, Muchun Song wrote:
> On Sat, Jul 10, 2021 at 8:25 AM Mike Kravetz <[email protected]> wrote:
>> + /*
>> + * Very subtle
>> + *
>> + * For non-gigantic pages set the destructor to the normal compound
>> + * page dtor. This is needed in case someone takes an additional
>> + * temporary ref to the page, and freeing is delayed until they drop
>> + * their reference.
>> + *
>> + * For gigantic pages set the destructor to the null dtor. This
>> + * destructor will never be called. Before freeing the gigantic
>> + * page destroy_compound_gigantic_page will turn the compound page
>> + * into a simple group of pages. After this the destructor does not
>> + * apply.
>> + *
>> + * This handles the case where more than one ref is held when and
>> + * after update_and_free_page is called.
>> + */
>> set_page_refcounted(page);
>> - set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
>> + if (hstate_is_gigantic(h))
>> + set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
>> + else
>> + set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
>
> Hi Mike,
>
> The race is really subtle. But we also should remove the WARN from
> free_contig_range, right? Because the refcount of the head page of
> the gigantic page can be greater than one, but free_contig_range has
> the following warning.
>
> WARN(count != 0, "%lu pages are still in use!\n", count);
>

I did hit that warning in my testing and thought about removing it.
However, I decided to keep it because non-hugetlb code also makes use of
alloc_contig_range/free_contig_range and it might be useful in those
cases.

My 'guess' is that the warning was added not because of temporary ref
count increases but rather to point out any code that forgot to drop a
reference.

BTW - It is not just the 'head' page which could trigger this warning, but
any 'tail' page as well. That is because we do not call free_contig_range
with a compound page, but rather a group of pages all with ref count of
at least one.

I'm happy to remove the warning if people do not think it is generally
useful.
--
Mike Kravetz

2021-07-15 03:17:57

by Muchun Song

[permalink] [raw]
Subject: Re: [External] [PATCH 3/3] hugetlb: before freeing hugetlb page set dtor to appropriate value

On Thu, Jul 15, 2021 at 1:39 AM Mike Kravetz <[email protected]> wrote:
>
> On 7/14/21 3:57 AM, Muchun Song wrote:
> > On Sat, Jul 10, 2021 at 8:25 AM Mike Kravetz <[email protected]> wrote:
> >> + /*
> >> + * Very subtle
> >> + *
> >> + * For non-gigantic pages set the destructor to the normal compound
> >> + * page dtor. This is needed in case someone takes an additional
> >> + * temporary ref to the page, and freeing is delayed until they drop
> >> + * their reference.
> >> + *
> >> + * For gigantic pages set the destructor to the null dtor. This
> >> + * destructor will never be called. Before freeing the gigantic
> >> + * page destroy_compound_gigantic_page will turn the compound page
> >> + * into a simple group of pages. After this the destructor does not
> >> + * apply.
> >> + *
> >> + * This handles the case where more than one ref is held when and
> >> + * after update_and_free_page is called.
> >> + */
> >> set_page_refcounted(page);
> >> - set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
> >> + if (hstate_is_gigantic(h))
> >> + set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
> >> + else
> >> + set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
> >
> > Hi Mike,
> >
> > The race is really subtle. But we also should remove the WARN from
> > free_contig_range, right? Because the refcount of the head page of
> > the gigantic page can be greater than one, but free_contig_range has
> > the following warning.
> >
> > WARN(count != 0, "%lu pages are still in use!\n", count);
> >
>
> I did hit that warning in my testing and thought about removing it.
> However, I decided to keep it because non-hugetlb code also makes use of
> alloc_contig_range/free_contig_range and it might be useful in those
> cases.
>
> My 'guess' is that the warning was added not because of temporary ref
> count increases but rather to point out any code that forgot to drop a
> reference.

Got it. At least this patch looks good to me. So

Reviewed-by: Muchun Song <[email protected]>

>
> BTW - It is not just the 'head' page which could trigger this warning, but
> any 'tail' page as well. That is because we do not call free_contig_range
> with a compound page, but rather a group of pages all with ref count of
> at least one.

Right.

>
> I'm happy to remove the warning if people do not think it is generally
> useful.

For me, I suggest removing it. If someone has any ideas, please
let us know.

> --
> Mike Kravetz