2021-04-13 15:25:34

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v7 3/7] mm,hugetlb: Clear HPageFreed outside of the lock

On Tue 13-04-21 12:47:43, Oscar Salvador wrote:
> Currently, the clearing of the flag is done under the lock, but this
> is unnecessary as we just allocated the page and we did not give it
> away yet, so no one should be messing with it.
>
> Also, this helps making clear that here the lock is only protecting the
> counter.

While moving the flag clearing is ok I am wondering why do we need that
in the first place. I think it is just a leftover from 6c0371490140
("hugetlb: convert PageHugeFreed to HPageFreed flag"). Prior to that a tail
page as been used to keep track of the state but now all happens in the
head page and the flag uses page->private which is always initialized
when allocated by the allocator (post_alloc_hook).

Or do we need it for giga pages which are not allocated by the page
allocator? If yes then moving it to prep_compound_gigantic_page would be
better.

So should we just drop it here?

> Signed-off-by: Oscar Salvador <[email protected]>
> ---
> mm/hugetlb.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 54d81d5947ed..e40d5fe5c63c 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1490,10 +1490,10 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
> hugetlb_set_page_subpool(page, NULL);
> set_hugetlb_cgroup(page, NULL);
> set_hugetlb_cgroup_rsvd(page, NULL);
> + ClearHPageFreed(page);
> spin_lock_irq(&hugetlb_lock);
> h->nr_huge_pages++;
> h->nr_huge_pages_node[nid]++;
> - ClearHPageFreed(page);
> spin_unlock_irq(&hugetlb_lock);
> }
>
> --
> 2.16.3

--
Michal Hocko
SUSE Labs


2021-04-14 08:19:02

by Mike Kravetz

[permalink] [raw]
Subject: Re: [PATCH v7 3/7] mm,hugetlb: Clear HPageFreed outside of the lock

On 4/13/21 6:23 AM, Michal Hocko wrote:
> On Tue 13-04-21 12:47:43, Oscar Salvador wrote:
>> Currently, the clearing of the flag is done under the lock, but this
>> is unnecessary as we just allocated the page and we did not give it
>> away yet, so no one should be messing with it.
>>
>> Also, this helps making clear that here the lock is only protecting the
>> counter.
>
> While moving the flag clearing is ok I am wondering why do we need that
> in the first place. I think it is just a leftover from 6c0371490140
> ("hugetlb: convert PageHugeFreed to HPageFreed flag"). Prior to that a tail
> page as been used to keep track of the state but now all happens in the
> head page and the flag uses page->private which is always initialized
> when allocated by the allocator (post_alloc_hook).

Yes, this was just left over from 6c0371490140. And, as you mention
post_alloc_hook will clear page->private for all non-gigantic pages
allocated via buddy.

> Or do we need it for giga pages which are not allocated by the page
> allocator? If yes then moving it to prep_compound_gigantic_page would be
> better.

I am pretty sure dynamically allocated giga pages have page->Private
cleared as well. It is not obvious, but the alloc_contig_range code
used to put together giga pages will end up calling isolate_freepages_range
that will call split_map_pages and then post_alloc_hook for each MAX_ORDER
block. As mentioned, this is not obvious and I would hate to rely on this
behavior not changing.

>
> So should we just drop it here?

The only place where page->private may not be initialized is when we do
allocations at boot time from memblock. In this case, we will add the
pages to the free list via put_page/free_huge_page so the appropriate
flags will be cleared before anyone notices.

I'm wondering if we should just do a set_page_private(page, 0) here in
prep_new_huge_page since we now use that field for flags. Or, is that
overkill?
--
Mike Kravetz

2021-04-14 13:48:43

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v7 3/7] mm,hugetlb: Clear HPageFreed outside of the lock

On Tue 13-04-21 14:19:03, Mike Kravetz wrote:
> On 4/13/21 6:23 AM, Michal Hocko wrote:
> > On Tue 13-04-21 12:47:43, Oscar Salvador wrote:
[...]
> > Or do we need it for giga pages which are not allocated by the page
> > allocator? If yes then moving it to prep_compound_gigantic_page would be
> > better.
>
> I am pretty sure dynamically allocated giga pages have page->Private
> cleared as well. It is not obvious, but the alloc_contig_range code
> used to put together giga pages will end up calling isolate_freepages_range
> that will call split_map_pages and then post_alloc_hook for each MAX_ORDER
> block.

Thanks for saving me from crawling that code.

> As mentioned, this is not obvious and I would hate to rely on this
> behavior not changing.

Thinking about it some more, having some (page granularity) allocator
not clearing page private would be a serious problem for anybody relying
on its state. So I am not sure this can change.

> > So should we just drop it here?
>
> The only place where page->private may not be initialized is when we do
> allocations at boot time from memblock. In this case, we will add the
> pages to the free list via put_page/free_huge_page so the appropriate
> flags will be cleared before anyone notices.

Pages allocated by the bootmem should be pre initialized from the boot,
no?

> I'm wondering if we should just do a set_page_private(page, 0) here in
> prep_new_huge_page since we now use that field for flags. Or, is that
> overkill?

I would rather not duplicate the work done by underlying allocators. I
do not think other users of the allocator want to do the same so why
should hugetlb be any different.
--
Michal Hocko
SUSE Labs

2021-04-14 14:32:25

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH v7 3/7] mm,hugetlb: Clear HPageFreed outside of the lock

On Wed, Apr 14, 2021 at 08:04:21AM +0200, Michal Hocko wrote:
> On Tue 13-04-21 14:19:03, Mike Kravetz wrote:
> > On 4/13/21 6:23 AM, Michal Hocko wrote:
> > The only place where page->private may not be initialized is when we do
> > allocations at boot time from memblock. In this case, we will add the
> > pages to the free list via put_page/free_huge_page so the appropriate
> > flags will be cleared before anyone notices.
>
> Pages allocated by the bootmem should be pre initialized from the boot,
> no?

I guess Mike means:

hugetlb_hstate_alloc_pages
alloc_bootmem_huge_page
__alloc_bootmem_huge_page
memblock_alloc_try_nid_raw

and AFAICS, memblock_alloc_try_nid_raw() does not zero the memory.

Then these pages are initialized in:

gather_bootmem_prealloc
prep_compound_huge_page
prep_new_huge_page

But as can be noticed, no one touches page->private when coming from that
path.

--
Oscar Salvador
SUSE L3

2021-04-14 14:51:23

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v7 3/7] mm,hugetlb: Clear HPageFreed outside of the lock

On Wed 14-04-21 09:41:32, Oscar Salvador wrote:
> On Wed, Apr 14, 2021 at 08:04:21AM +0200, Michal Hocko wrote:
> > On Tue 13-04-21 14:19:03, Mike Kravetz wrote:
> > > On 4/13/21 6:23 AM, Michal Hocko wrote:
> > > The only place where page->private may not be initialized is when we do
> > > allocations at boot time from memblock. In this case, we will add the
> > > pages to the free list via put_page/free_huge_page so the appropriate
> > > flags will be cleared before anyone notices.
> >
> > Pages allocated by the bootmem should be pre initialized from the boot,
> > no?
>
> I guess Mike means:
>
> hugetlb_hstate_alloc_pages
> alloc_bootmem_huge_page
> __alloc_bootmem_huge_page
> memblock_alloc_try_nid_raw
>
> and AFAICS, memblock_alloc_try_nid_raw() does not zero the memory.

You are right it doesn't do it there. But all struct pages, even those
that are allocated by the bootmem allocator should initialize its struct
pages. They would be poisoned otherwise, right? I would have to look at
the exact code path but IIRC this should be around the time bootmem
allocator state transitions to the page allocator.
--
Michal Hocko
SUSE Labs

2021-04-14 23:09:49

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH v7 3/7] mm,hugetlb: Clear HPageFreed outside of the lock

On Wed, Apr 14, 2021 at 10:28:33AM +0200, Michal Hocko wrote:
> You are right it doesn't do it there. But all struct pages, even those
> that are allocated by the bootmem allocator should initialize its struct
> pages. They would be poisoned otherwise, right? I would have to look at
> the exact code path but IIRC this should be around the time bootmem
> allocator state transitions to the page allocator.

Ok, you are right.
struct pages are initialized a bit earlier through:

start_kernel
setup_arch
paging_init
zone_sizes_init
free_area_init
free_area_init_node
free_area_init_core
memmap_init_zone
memmap_init_range
__init_single_page

While the allocation of bootmem hugetlb happens

start_kernel
parse_args
...
hugepages_setup
...
hugetlb_hstate_alloc_pages
__alloc_bootmem_huge_page

which is after the setup_arch() call.

So by the time we get the page from __alloc_bootmem_huge_page(), fields are
zeroed.
I thought we might get in trouble because memblock_alloc_try_nid_raw() calls
page_init_poison() which poisons the chunk with 0xff,e.g:

[ 1.955471] boot: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[ 1.955476] boot: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff

but it seems that does not the memmap struct page.

I checked, and when we get there in __alloc_bootmem_huge_page, page->private is
still zeroed, so I guess it should be safe to assume that we do not really need
to clear the flag in __prep_new_huge_page() routine?

--
Oscar Salvador
SUSE L3

2021-04-14 23:10:47

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH v7 3/7] mm,hugetlb: Clear HPageFreed outside of the lock

On Wed, Apr 14, 2021 at 12:01:47PM +0200, Oscar Salvador wrote:
> but it seems that does not the memmap struct page.
that sould read as "but it seems that that does not affect the memmap struct page"


--
Oscar Salvador
SUSE L3