From: Naoya Horiguchi <[email protected]>
When memory_failure() or soft_offline_page() is called on a tail page of
some hugetlb page, "BUG: unable to handle page fault" error can be
triggered.
remove_hugetlb_page() dereferences page->lru, so it's assumed that the
page points to a head page, but one of the caller,
dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
which could be a tail page. So pass 'head' to it, instead.
Fixes: 6eb4e88a6d27 ("hugetlb: create remove_hugetlb_page() to separate functionality")
Signed-off-by: Naoya Horiguchi <[email protected]>
---
mm/hugetlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git v5.13-rc3/mm/hugetlb.c v5.13-rc3_patched/mm/hugetlb.c
index 95918f410c0f..470f7b5b437e 100644
--- v5.13-rc3/mm/hugetlb.c
+++ v5.13-rc3_patched/mm/hugetlb.c
@@ -1793,7 +1793,7 @@ int dissolve_free_huge_page(struct page *page)
SetPageHWPoison(page);
ClearPageHWPoison(head);
}
- remove_hugetlb_page(h, page, false);
+ remove_hugetlb_page(h, head, false);
h->max_huge_pages--;
spin_unlock_irq(&hugetlb_lock);
update_and_free_page(h, head);
--
2.25.1
On Thu, May 27, 2021 at 08:52:57AM +0900, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <[email protected]>
>
> When memory_failure() or soft_offline_page() is called on a tail page of
> some hugetlb page, "BUG: unable to handle page fault" error can be
> triggered.
>
> remove_hugetlb_page() dereferences page->lru, so it's assumed that the
> page points to a head page, but one of the caller,
> dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
> which could be a tail page. So pass 'head' to it, instead.
I'd like to point out that with folios, this is a compile-time error,
not a run-time error.
On 5/26/21 4:52 PM, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <[email protected]>
>
> When memory_failure() or soft_offline_page() is called on a tail page of
> some hugetlb page, "BUG: unable to handle page fault" error can be
> triggered.
>
> remove_hugetlb_page() dereferences page->lru, so it's assumed that the
> page points to a head page, but one of the caller,
> dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
> which could be a tail page. So pass 'head' to it, instead.
>
> Fixes: 6eb4e88a6d27 ("hugetlb: create remove_hugetlb_page() to separate functionality")
> Signed-off-by: Naoya Horiguchi <[email protected]>
> ---
> mm/hugetlb.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Thanks Naoya!
Reviewed-by: Mike Kravetz <[email protected]>
--
Mike Kravetz
>
> diff --git v5.13-rc3/mm/hugetlb.c v5.13-rc3_patched/mm/hugetlb.c
> index 95918f410c0f..470f7b5b437e 100644
> --- v5.13-rc3/mm/hugetlb.c
> +++ v5.13-rc3_patched/mm/hugetlb.c
> @@ -1793,7 +1793,7 @@ int dissolve_free_huge_page(struct page *page)
> SetPageHWPoison(page);
> ClearPageHWPoison(head);
> }
> - remove_hugetlb_page(h, page, false);
> + remove_hugetlb_page(h, head, false);
> h->max_huge_pages--;
> spin_unlock_irq(&hugetlb_lock);
> update_and_free_page(h, head);
>
On Thu, May 27, 2021 at 7:53 AM Naoya Horiguchi <[email protected]> wrote:
>
> From: Naoya Horiguchi <[email protected]>
>
> When memory_failure() or soft_offline_page() is called on a tail page of
> some hugetlb page, "BUG: unable to handle page fault" error can be
> triggered.
>
> remove_hugetlb_page() dereferences page->lru, so it's assumed that the
> page points to a head page, but one of the caller,
> dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
> which could be a tail page. So pass 'head' to it, instead.
>
> Fixes: 6eb4e88a6d27 ("hugetlb: create remove_hugetlb_page() to separate functionality")
> Signed-off-by: Naoya Horiguchi <[email protected]>
Thanks Naoya!
Reviewed-by: Muchun Song <[email protected]>
On Thu 27-05-21 08:52:57, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <[email protected]>
>
> When memory_failure() or soft_offline_page() is called on a tail page of
> some hugetlb page, "BUG: unable to handle page fault" error can be
> triggered.
>
> remove_hugetlb_page() dereferences page->lru, so it's assumed that the
> page points to a head page, but one of the caller,
> dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
> which could be a tail page. So pass 'head' to it, instead.
>
> Fixes: 6eb4e88a6d27 ("hugetlb: create remove_hugetlb_page() to separate functionality")
> Signed-off-by: Naoya Horiguchi <[email protected]>
This is really nasty and easy to overlook. I have completely missed that
when reviewing and I do remember checking for head vs page as there is
quite some non trivial handling of both here.
Acked-by: Michal Hocko <[email protected]>
Thanks!
> ---
> mm/hugetlb.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git v5.13-rc3/mm/hugetlb.c v5.13-rc3_patched/mm/hugetlb.c
> index 95918f410c0f..470f7b5b437e 100644
> --- v5.13-rc3/mm/hugetlb.c
> +++ v5.13-rc3_patched/mm/hugetlb.c
> @@ -1793,7 +1793,7 @@ int dissolve_free_huge_page(struct page *page)
> SetPageHWPoison(page);
> ClearPageHWPoison(head);
> }
> - remove_hugetlb_page(h, page, false);
> + remove_hugetlb_page(h, head, false);
> h->max_huge_pages--;
> spin_unlock_irq(&hugetlb_lock);
> update_and_free_page(h, head);
> --
> 2.25.1
--
Michal Hocko
SUSE Labs
On Thu, May 27, 2021 at 08:52:57AM +0900, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <[email protected]>
>
> When memory_failure() or soft_offline_page() is called on a tail page of
> some hugetlb page, "BUG: unable to handle page fault" error can be
> triggered.
>
> remove_hugetlb_page() dereferences page->lru, so it's assumed that the
> page points to a head page, but one of the caller,
> dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
> which could be a tail page. So pass 'head' to it, instead.
>
> Fixes: 6eb4e88a6d27 ("hugetlb: create remove_hugetlb_page() to separate functionality")
> Signed-off-by: Naoya Horiguchi <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
It is probably worth adding a comment in remove_hugetlb_page() noting
that we need a head page, so future users do not repeat the same
mistake.
Thanks
--
Oscar Salvador
SUSE L3
On Thu 27-05-21 09:47:44, Oscar Salvador wrote:
> On Thu, May 27, 2021 at 08:52:57AM +0900, Naoya Horiguchi wrote:
> > From: Naoya Horiguchi <[email protected]>
> >
> > When memory_failure() or soft_offline_page() is called on a tail page of
> > some hugetlb page, "BUG: unable to handle page fault" error can be
> > triggered.
> >
> > remove_hugetlb_page() dereferences page->lru, so it's assumed that the
> > page points to a head page, but one of the caller,
> > dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
> > which could be a tail page. So pass 'head' to it, instead.
> >
> > Fixes: 6eb4e88a6d27 ("hugetlb: create remove_hugetlb_page() to separate functionality")
> > Signed-off-by: Naoya Horiguchi <[email protected]>
>
> Reviewed-by: Oscar Salvador <[email protected]>
>
> It is probably worth adding a comment in remove_hugetlb_page() noting
> that we need a head page, so future users do not repeat the same
> mistake.
Ideally this will turn into page folio concept and no comments are
really needed.
--
Michal Hocko
SUSE Labs
On 5/26/21 4:52 PM, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <[email protected]>
>
> When memory_failure() or soft_offline_page() is called on a tail page of
> some hugetlb page, "BUG: unable to handle page fault" error can be
> triggered.
>
> remove_hugetlb_page() dereferences page->lru, so it's assumed that the
> page points to a head page, but one of the caller,
> dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
> which could be a tail page. So pass 'head' to it, instead.
>
> Fixes: 6eb4e88a6d27 ("hugetlb: create remove_hugetlb_page() to separate functionality")
> Signed-off-by: Naoya Horiguchi <[email protected]>
> ---
> mm/hugetlb.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git v5.13-rc3/mm/hugetlb.c v5.13-rc3_patched/mm/hugetlb.c
> index 95918f410c0f..470f7b5b437e 100644
> --- v5.13-rc3/mm/hugetlb.c
> +++ v5.13-rc3_patched/mm/hugetlb.c
> @@ -1793,7 +1793,7 @@ int dissolve_free_huge_page(struct page *page)
> SetPageHWPoison(page);
> ClearPageHWPoison(head);
> }
> - remove_hugetlb_page(h, page, false);
> + remove_hugetlb_page(h, head, false);
> h->max_huge_pages--;
> spin_unlock_irq(&hugetlb_lock);
> update_and_free_page(h, head);
>
I believe we have the same problem later in the routine when calling
add_hugetlb_page()?
If so, should we combine the changes? Or, do we need two patches as
the bugs were introduced with different commits?
--
Mike Kravetz
On Thu 27-05-21 09:28:51, Mike Kravetz wrote:
> On 5/26/21 4:52 PM, Naoya Horiguchi wrote:
> > From: Naoya Horiguchi <[email protected]>
> >
> > When memory_failure() or soft_offline_page() is called on a tail page of
> > some hugetlb page, "BUG: unable to handle page fault" error can be
> > triggered.
> >
> > remove_hugetlb_page() dereferences page->lru, so it's assumed that the
> > page points to a head page, but one of the caller,
> > dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
> > which could be a tail page. So pass 'head' to it, instead.
> >
> > Fixes: 6eb4e88a6d27 ("hugetlb: create remove_hugetlb_page() to separate functionality")
> > Signed-off-by: Naoya Horiguchi <[email protected]>
> > ---
> > mm/hugetlb.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git v5.13-rc3/mm/hugetlb.c v5.13-rc3_patched/mm/hugetlb.c
> > index 95918f410c0f..470f7b5b437e 100644
> > --- v5.13-rc3/mm/hugetlb.c
> > +++ v5.13-rc3_patched/mm/hugetlb.c
> > @@ -1793,7 +1793,7 @@ int dissolve_free_huge_page(struct page *page)
> > SetPageHWPoison(page);
> > ClearPageHWPoison(head);
> > }
> > - remove_hugetlb_page(h, page, false);
> > + remove_hugetlb_page(h, head, false);
> > h->max_huge_pages--;
> > spin_unlock_irq(&hugetlb_lock);
> > update_and_free_page(h, head);
> >
>
> I believe we have the same problem later in the routine when calling
> add_hugetlb_page()?
Can we ever get a tail page there?
> If so, should we combine the changes? Or, do we need two patches as
> the bugs were introduced with different commits?
If there is an issue then I would go with a separate patch.
Thanks!
--
Michal Hocko
SUSE Labs
On 5/27/21 12:54 PM, Michal Hocko wrote:
> On Thu 27-05-21 09:28:51, Mike Kravetz wrote:
>> On 5/26/21 4:52 PM, Naoya Horiguchi wrote:
>>> From: Naoya Horiguchi <[email protected]>
>>>
>>> remove_hugetlb_page() dereferences page->lru, so it's assumed that the
>>> page points to a head page, but one of the caller,
>>> dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
>>> which could be a tail page. So pass 'head' to it, instead.
>>>
>>
>> I believe we have the same problem later in the routine when calling
>> add_hugetlb_page()?
>
> Can we ever get a tail page there?
>
Yes. Actually alloc_huge_page_vmemmap() and add_hugetlb_page() calls
later in the same block of code expect head page but we pass 'page' which
could be a tail page.
>> If so, should we combine the changes? Or, do we need two patches as
>> the bugs were introduced with different commits?
>
> If there is an issue then I would go with a separate patch.
>
The calls to alloc_huge_page_vmemmap() and add_hugetlb_page() are only
in Andrew's tree (and next). So, they can be fixed there.
I'll send a patch shortly.
--
Mike Kravetz