2021-02-23 21:59:05

by Mike Kravetz

[permalink] [raw]
Subject: [PATCH] hugetlb: fix uninitialized subpool pointer

Gerald Schaefer reported a panic on s390 in hugepage_subpool_put_pages()
with linux-next 5.12.0-20210222.
Call trace:
hugepage_subpool_put_pages.part.0+0x2c/0x138
__free_huge_page+0xce/0x310
alloc_pool_huge_page+0x102/0x120
set_max_huge_pages+0x13e/0x350
hugetlb_sysctl_handler_common+0xd8/0x110
hugetlb_sysctl_handler+0x48/0x58
proc_sys_call_handler+0x138/0x238
new_sync_write+0x10e/0x198
vfs_write.part.0+0x12c/0x238
ksys_write+0x68/0xf8
do_syscall+0x82/0xd0
__do_syscall+0xb4/0xc8
system_call+0x72/0x98

This is a result of the change which moved the hugetlb page subpool
pointer from page->private to page[1]->private. When new pages are
allocated from the buddy allocator, the private field of the head
page will be cleared, but the private field of subpages is not modified.
Therefore, old values may remain.

Fix by initializing hugetlb page subpool pointer in prep_new_huge_page().

Fixes: f1280272ae4d ("hugetlb: use page.private for hugetlb specific page flags")
Reported-by: Gerald Schaefer <[email protected]>
Signed-off-by: Mike Kravetz <[email protected]>
---
mm/hugetlb.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c232cb67dda2..7ae5c18c98a7 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1465,6 +1465,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
{
INIT_LIST_HEAD(&page->lru);
set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
+ hugetlb_set_page_subpool(page, NULL);
set_hugetlb_cgroup(page, NULL);
set_hugetlb_cgroup_rsvd(page, NULL);
spin_lock(&hugetlb_lock);
--
2.29.2


2021-02-24 00:57:48

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH] hugetlb: fix uninitialized subpool pointer

On Tue, Feb 23, 2021 at 01:55:44PM -0800, Mike Kravetz wrote:
> Gerald Schaefer reported a panic on s390 in hugepage_subpool_put_pages()
> with linux-next 5.12.0-20210222.
> Call trace:
> hugepage_subpool_put_pages.part.0+0x2c/0x138
> __free_huge_page+0xce/0x310
> alloc_pool_huge_page+0x102/0x120
> set_max_huge_pages+0x13e/0x350
> hugetlb_sysctl_handler_common+0xd8/0x110
> hugetlb_sysctl_handler+0x48/0x58
> proc_sys_call_handler+0x138/0x238
> new_sync_write+0x10e/0x198
> vfs_write.part.0+0x12c/0x238
> ksys_write+0x68/0xf8
> do_syscall+0x82/0xd0
> __do_syscall+0xb4/0xc8
> system_call+0x72/0x98
>
> This is a result of the change which moved the hugetlb page subpool
> pointer from page->private to page[1]->private. When new pages are
> allocated from the buddy allocator, the private field of the head
> page will be cleared, but the private field of subpages is not modified.
> Therefore, old values may remain.
>
> Fix by initializing hugetlb page subpool pointer in prep_new_huge_page().
>
> Fixes: f1280272ae4d ("hugetlb: use page.private for hugetlb specific page flags")
> Reported-by: Gerald Schaefer <[email protected]>
> Signed-off-by: Mike Kravetz <[email protected]>

Do we need the hugetlb_set_page_subpool() in __free_huge_page?

Reviewed-by: Oscar Salvador <[email protected]>


--
Oscar Salvador
SUSE L3

2021-02-24 01:06:38

by Mike Kravetz

[permalink] [raw]
Subject: Re: [PATCH] hugetlb: fix uninitialized subpool pointer

On 2/23/21 2:58 PM, Oscar Salvador wrote:
> On 2021-02-23 23:55, Mike Kravetz wrote:
>> Yes, that is the more common case where the once active hugetlb page
>> will be simply added to the free list via enqueue_huge_page(). This
>> path does not go through prep_new_huge_page.
>
> Right, I see.
>
> Thanks

You got me thinking ...
When we dynamically allocate gigantic pages via alloc_contig_pages, we
will not use the buddy allocator. Therefore, the usual 'page prepping'
will not take place. Specifically, I could not find anything in that
path which clears page->private of the head page.
Am I missing that somewhere? If not, then we need to clear that as well
in prep_compound_gigantic_page. Or, just clear it in prep_new_huge_page
to handle any change in assumptions about the buddy allocator.

This is not something introduced with the recent field shuffling, it
looks like something that existed for some time.
--
Mike Kravetz

2021-02-24 01:53:53

by Mike Kravetz

[permalink] [raw]
Subject: Re: [PATCH] hugetlb: fix uninitialized subpool pointer

On 2/23/21 3:21 PM, Mike Kravetz wrote:
> On 2/23/21 2:58 PM, Oscar Salvador wrote:
>> On 2021-02-23 23:55, Mike Kravetz wrote:
>>> Yes, that is the more common case where the once active hugetlb page
>>> will be simply added to the free list via enqueue_huge_page(). This
>>> path does not go through prep_new_huge_page.
>>
>> Right, I see.
>>
>> Thanks
>
> You got me thinking ...
> When we dynamically allocate gigantic pages via alloc_contig_pages, we
> will not use the buddy allocator. Therefore, the usual 'page prepping'
> will not take place. Specifically, I could not find anything in that
> path which clears page->private of the head page.
> Am I missing that somewhere? If not, then we need to clear that as well
> in prep_compound_gigantic_page. Or, just clear it in prep_new_huge_page
> to handle any change in assumptions about the buddy allocator.
>
> This is not something introduced with the recent field shuffling, it
> looks like something that existed for some time.

nm, we do end up calling the same page prepping code (post_alloc_hook)
from alloc_contig_range->isolate_freepages_range.

Just to make sure, I purpously dirtied page->private of every page as it
was being freed. Gigantic page allocation was just fine, and I even ran
ltp mm tests with this dirtying in place.
--
Mike Kravetz

2021-02-24 05:23:35

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH] hugetlb: fix uninitialized subpool pointer

On 2021-02-23 23:55, Mike Kravetz wrote:
> Yes, that is the more common case where the once active hugetlb page
> will be simply added to the free list via enqueue_huge_page(). This
> path does not go through prep_new_huge_page.

Right, I see.

Thanks

--
Oscar Salvador
SUSE L3

2021-02-24 05:25:13

by Mike Kravetz

[permalink] [raw]
Subject: Re: [PATCH] hugetlb: fix uninitialized subpool pointer

On 2/23/21 2:45 PM, Oscar Salvador wrote:
> On Tue, Feb 23, 2021 at 01:55:44PM -0800, Mike Kravetz wrote:
>> Gerald Schaefer reported a panic on s390 in hugepage_subpool_put_pages()
>> with linux-next 5.12.0-20210222.
>> Call trace:
>> hugepage_subpool_put_pages.part.0+0x2c/0x138
>> __free_huge_page+0xce/0x310
>> alloc_pool_huge_page+0x102/0x120
>> set_max_huge_pages+0x13e/0x350
>> hugetlb_sysctl_handler_common+0xd8/0x110
>> hugetlb_sysctl_handler+0x48/0x58
>> proc_sys_call_handler+0x138/0x238
>> new_sync_write+0x10e/0x198
>> vfs_write.part.0+0x12c/0x238
>> ksys_write+0x68/0xf8
>> do_syscall+0x82/0xd0
>> __do_syscall+0xb4/0xc8
>> system_call+0x72/0x98
>>
>> This is a result of the change which moved the hugetlb page subpool
>> pointer from page->private to page[1]->private. When new pages are
>> allocated from the buddy allocator, the private field of the head
>> page will be cleared, but the private field of subpages is not modified.
>> Therefore, old values may remain.
>>
>> Fix by initializing hugetlb page subpool pointer in prep_new_huge_page().
>>
>> Fixes: f1280272ae4d ("hugetlb: use page.private for hugetlb specific page flags")
>> Reported-by: Gerald Schaefer <[email protected]>
>> Signed-off-by: Mike Kravetz <[email protected]>
>
> Do we need the hugetlb_set_page_subpool() in __free_huge_page?

Yes, that is the more common case where the once active hugetlb page
will be simply added to the free list via enqueue_huge_page(). This
path does not go through prep_new_huge_page.

--
Mike Kravetz

2021-02-24 07:15:23

by Muchun Song

[permalink] [raw]
Subject: Re: [External] [PATCH] hugetlb: fix uninitialized subpool pointer

On Wed, Feb 24, 2021 at 5:56 AM Mike Kravetz <[email protected]> wrote:
>
> Gerald Schaefer reported a panic on s390 in hugepage_subpool_put_pages()
> with linux-next 5.12.0-20210222.
> Call trace:
> hugepage_subpool_put_pages.part.0+0x2c/0x138
> __free_huge_page+0xce/0x310
> alloc_pool_huge_page+0x102/0x120
> set_max_huge_pages+0x13e/0x350
> hugetlb_sysctl_handler_common+0xd8/0x110
> hugetlb_sysctl_handler+0x48/0x58
> proc_sys_call_handler+0x138/0x238
> new_sync_write+0x10e/0x198
> vfs_write.part.0+0x12c/0x238
> ksys_write+0x68/0xf8
> do_syscall+0x82/0xd0
> __do_syscall+0xb4/0xc8
> system_call+0x72/0x98
>
> This is a result of the change which moved the hugetlb page subpool
> pointer from page->private to page[1]->private. When new pages are
> allocated from the buddy allocator, the private field of the head
> page will be cleared, but the private field of subpages is not modified.
> Therefore, old values may remain.
>
> Fix by initializing hugetlb page subpool pointer in prep_new_huge_page().
>
> Fixes: f1280272ae4d ("hugetlb: use page.private for hugetlb specific page flags")
> Reported-by: Gerald Schaefer <[email protected]>
> Signed-off-by: Mike Kravetz <[email protected]>

Reviewed-by: Muchun Song <[email protected]>

Thanks.

> ---
> mm/hugetlb.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c232cb67dda2..7ae5c18c98a7 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1465,6 +1465,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
> {
> INIT_LIST_HEAD(&page->lru);
> set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
> + hugetlb_set_page_subpool(page, NULL);
> set_hugetlb_cgroup(page, NULL);
> set_hugetlb_cgroup_rsvd(page, NULL);
> spin_lock(&hugetlb_lock);
> --
> 2.29.2
>

2021-02-24 09:49:34

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] hugetlb: fix uninitialized subpool pointer

On Tue 23-02-21 13:55:44, Mike Kravetz wrote:
> Gerald Schaefer reported a panic on s390 in hugepage_subpool_put_pages()
> with linux-next 5.12.0-20210222.
> Call trace:
> hugepage_subpool_put_pages.part.0+0x2c/0x138
> __free_huge_page+0xce/0x310
> alloc_pool_huge_page+0x102/0x120
> set_max_huge_pages+0x13e/0x350
> hugetlb_sysctl_handler_common+0xd8/0x110
> hugetlb_sysctl_handler+0x48/0x58
> proc_sys_call_handler+0x138/0x238
> new_sync_write+0x10e/0x198
> vfs_write.part.0+0x12c/0x238
> ksys_write+0x68/0xf8
> do_syscall+0x82/0xd0
> __do_syscall+0xb4/0xc8
> system_call+0x72/0x98
>
> This is a result of the change which moved the hugetlb page subpool
> pointer from page->private to page[1]->private. When new pages are
> allocated from the buddy allocator, the private field of the head
> page will be cleared, but the private field of subpages is not modified.
> Therefore, old values may remain.

Very interesting. I have expected that the page->private would be in a
reasonable state when allocated. On the other hand hugetlb doesn't do
__GFP_COMP so tail pages are not initialized by the allocator.

> Fix by initializing hugetlb page subpool pointer in prep_new_huge_page().
>
> Fixes: f1280272ae4d ("hugetlb: use page.private for hugetlb specific page flags")

This is not a stable sha to refer to as it comes from linux next.

> Reported-by: Gerald Schaefer <[email protected]>
> Signed-off-by: Mike Kravetz <[email protected]>

Acked-by: Michal Hocko <[email protected]>

I think this would be worth a separate patch rather than having it
folded into the original patch. Thi is really subtle.

> ---
> mm/hugetlb.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c232cb67dda2..7ae5c18c98a7 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1465,6 +1465,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
> {
> INIT_LIST_HEAD(&page->lru);
> set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
> + hugetlb_set_page_subpool(page, NULL);
> set_hugetlb_cgroup(page, NULL);
> set_hugetlb_cgroup_rsvd(page, NULL);
> spin_lock(&hugetlb_lock);
> --
> 2.29.2
>

--
Michal Hocko
SUSE Labs

2021-02-24 11:17:42

by Muchun Song

[permalink] [raw]
Subject: Re: [External] Re: [PATCH] hugetlb: fix uninitialized subpool pointer

On Wed, Feb 24, 2021 at 5:43 PM Michal Hocko <[email protected]> wrote:
>
> On Tue 23-02-21 13:55:44, Mike Kravetz wrote:
> > Gerald Schaefer reported a panic on s390 in hugepage_subpool_put_pages()
> > with linux-next 5.12.0-20210222.
> > Call trace:
> > hugepage_subpool_put_pages.part.0+0x2c/0x138
> > __free_huge_page+0xce/0x310
> > alloc_pool_huge_page+0x102/0x120
> > set_max_huge_pages+0x13e/0x350
> > hugetlb_sysctl_handler_common+0xd8/0x110
> > hugetlb_sysctl_handler+0x48/0x58
> > proc_sys_call_handler+0x138/0x238
> > new_sync_write+0x10e/0x198
> > vfs_write.part.0+0x12c/0x238
> > ksys_write+0x68/0xf8
> > do_syscall+0x82/0xd0
> > __do_syscall+0xb4/0xc8
> > system_call+0x72/0x98
> >
> > This is a result of the change which moved the hugetlb page subpool
> > pointer from page->private to page[1]->private. When new pages are
> > allocated from the buddy allocator, the private field of the head
> > page will be cleared, but the private field of subpages is not modified.
> > Therefore, old values may remain.
>
> Very interesting. I have expected that the page->private would be in a
> reasonable state when allocated. On the other hand hugetlb doesn't do
> __GFP_COMP so tail pages are not initialized by the allocator.

It seems that the buddy allocator does not initialize the private field
of the tail page even when we specify __GFP_COMP.


>
> > Fix by initializing hugetlb page subpool pointer in prep_new_huge_page().
> >
> > Fixes: f1280272ae4d ("hugetlb: use page.private for hugetlb specific page flags")
>
> This is not a stable sha to refer to as it comes from linux next.
>
> > Reported-by: Gerald Schaefer <[email protected]>
> > Signed-off-by: Mike Kravetz <[email protected]>
>
> Acked-by: Michal Hocko <[email protected]>
>
> I think this would be worth a separate patch rather than having it
> folded into the original patch. Thi is really subtle.
>
> > ---
> > mm/hugetlb.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index c232cb67dda2..7ae5c18c98a7 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1465,6 +1465,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
> > {
> > INIT_LIST_HEAD(&page->lru);
> > set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
> > + hugetlb_set_page_subpool(page, NULL);
> > set_hugetlb_cgroup(page, NULL);
> > set_hugetlb_cgroup_rsvd(page, NULL);
> > spin_lock(&hugetlb_lock);
> > --
> > 2.29.2
> >
>
> --
> Michal Hocko
> SUSE Labs

2021-02-24 11:42:42

by Michal Hocko

[permalink] [raw]
Subject: Re: [External] Re: [PATCH] hugetlb: fix uninitialized subpool pointer

On Wed 24-02-21 19:10:42, Muchun Song wrote:
> On Wed, Feb 24, 2021 at 5:43 PM Michal Hocko <[email protected]> wrote:
> >
> > On Tue 23-02-21 13:55:44, Mike Kravetz wrote:
> > > Gerald Schaefer reported a panic on s390 in hugepage_subpool_put_pages()
> > > with linux-next 5.12.0-20210222.
> > > Call trace:
> > > hugepage_subpool_put_pages.part.0+0x2c/0x138
> > > __free_huge_page+0xce/0x310
> > > alloc_pool_huge_page+0x102/0x120
> > > set_max_huge_pages+0x13e/0x350
> > > hugetlb_sysctl_handler_common+0xd8/0x110
> > > hugetlb_sysctl_handler+0x48/0x58
> > > proc_sys_call_handler+0x138/0x238
> > > new_sync_write+0x10e/0x198
> > > vfs_write.part.0+0x12c/0x238
> > > ksys_write+0x68/0xf8
> > > do_syscall+0x82/0xd0
> > > __do_syscall+0xb4/0xc8
> > > system_call+0x72/0x98
> > >
> > > This is a result of the change which moved the hugetlb page subpool
> > > pointer from page->private to page[1]->private. When new pages are
> > > allocated from the buddy allocator, the private field of the head
> > > page will be cleared, but the private field of subpages is not modified.
> > > Therefore, old values may remain.
> >
> > Very interesting. I have expected that the page->private would be in a
> > reasonable state when allocated. On the other hand hugetlb doesn't do
> > __GFP_COMP so tail pages are not initialized by the allocator.
>
> It seems that the buddy allocator does not initialize the private field
> of the tail page even when we specify __GFP_COMP.

Yes it doesn't. What I meant to say is that even if it did a lack of
__GFP_COMP would result in not doing so. I do not remember why hugetlb
doesn't use __GFP_COMP but I believe this was never the case.
--
Michal Hocko
SUSE Labs