2021-04-21 02:47:50

by Ding Hui

[permalink] [raw]
Subject: [RFC PATCH] mm/page_alloc: fix counting of free pages after take off from buddy

Recently we found there is a lot MemFree left in /proc/meminfo after
do a lot of pages soft offline.

I think it's incorrect since NR_FREE_PAGES should not contain HWPoison pages.
After take_page_off_buddy, the page is no longer belong to buddy
allocator, and will not be used any more, but we maybe missed accounting
NR_FREE_PAGES in this situation.

Signed-off-by: Ding Hui <[email protected]>
---
mm/page_alloc.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cfc72873961d..8d65b62784d8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8947,6 +8947,7 @@ bool take_page_off_buddy(struct page *page)
del_page_from_free_list(page_head, zone, page_order);
break_down_buddy_pages(zone, page_head, page, 0,
page_order, migratetype);
+ __mod_zone_page_state(zone, NR_FREE_PAGES, -1);
ret = true;
break;
}
--
2.17.1


2021-04-28 17:38:11

by David Hildenbrand

[permalink] [raw]
Subject: Re: [RFC PATCH] mm/page_alloc: fix counting of free pages after take off from buddy

On 21.04.21 04:04, Ding Hui wrote:
> Recently we found there is a lot MemFree left in /proc/meminfo after
> do a lot of pages soft offline.
>
> I think it's incorrect since NR_FREE_PAGES should not contain HWPoison pages.
> After take_page_off_buddy, the page is no longer belong to buddy
> allocator, and will not be used any more, but we maybe missed accounting
> NR_FREE_PAGES in this situation.
>
> Signed-off-by: Ding Hui <[email protected]>
> ---
> mm/page_alloc.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cfc72873961d..8d65b62784d8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8947,6 +8947,7 @@ bool take_page_off_buddy(struct page *page)
> del_page_from_free_list(page_head, zone, page_order);
> break_down_buddy_pages(zone, page_head, page, 0,
> page_order, migratetype);
> + __mod_zone_page_state(zone, NR_FREE_PAGES, -1);
> ret = true;
> break;
> }
>

Should this use __mod_zone_freepage_state() instead?


--
Thanks,

David / dhildenb

2021-04-30 09:44:22

by Ding Hui

[permalink] [raw]
Subject: Re: [RFC PATCH] mm/page_alloc: fix counting of free pages after take off from buddy

On 2021/4/28 22:54, David Hildenbrand wrote:
> On 21.04.21 04:04, Ding Hui wrote:
>> Recently we found there is a lot MemFree left in /proc/meminfo after
>> do a lot of pages soft offline.
>>
>> I think it's incorrect since NR_FREE_PAGES should not contain HWPoison
>> pages.
>> After take_page_off_buddy, the page is no longer belong to buddy
>> allocator, and will not be used any more, but we maybe missed accounting
>> NR_FREE_PAGES in this situation.
>>
>> Signed-off-by: Ding Hui <[email protected]>
>> ---
>>   mm/page_alloc.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index cfc72873961d..8d65b62784d8 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -8947,6 +8947,7 @@ bool take_page_off_buddy(struct page *page)
>>               del_page_from_free_list(page_head, zone, page_order);
>>               break_down_buddy_pages(zone, page_head, page, 0,
>>                           page_order, migratetype);
>> +            __mod_zone_page_state(zone, NR_FREE_PAGES, -1);
>>               ret = true;
>>               break;
>>           }
>>
>
> Should this use __mod_zone_freepage_state() instead?
>

Yes, you're right.
I'll use it in v2.

--
Thanks,
- Ding Hui

Subject: Re: [RFC PATCH] mm/page_alloc: fix counting of free pages after take off from buddy

On Wed, Apr 28, 2021 at 04:54:59PM +0200, David Hildenbrand wrote:
> On 21.04.21 04:04, Ding Hui wrote:
> > Recently we found there is a lot MemFree left in /proc/meminfo after
> > do a lot of pages soft offline.
> >
> > I think it's incorrect since NR_FREE_PAGES should not contain HWPoison pages.
> > After take_page_off_buddy, the page is no longer belong to buddy
> > allocator, and will not be used any more, but we maybe missed accounting
> > NR_FREE_PAGES in this situation.
> >
> > Signed-off-by: Ding Hui <[email protected]>
> > ---
> > mm/page_alloc.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index cfc72873961d..8d65b62784d8 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -8947,6 +8947,7 @@ bool take_page_off_buddy(struct page *page)
> > del_page_from_free_list(page_head, zone, page_order);
> > break_down_buddy_pages(zone, page_head, page, 0,
> > page_order, migratetype);
> > + __mod_zone_page_state(zone, NR_FREE_PAGES, -1);
> > ret = true;
> > break;
> > }
> >
>
> Should this use __mod_zone_freepage_state() instead?

Yes, __mod_zone_freepage_state() looks better to me.

And I think that maybe an additional __mod_zone_freepage_state() in
unpoison_memory() is necessary to cancel the decrement. I thought of the
following, but it doesn't build because get_pfnblock_migratetype() is
available only in mm/page_alloc.c, so you might want to add a small exported
routine in mm/page_alloc.c and let it called from unpoison_memory().

@@ -1899,8 +1899,12 @@ int unpoison_memory(unsigned long pfn)
}

if (!get_hwpoison_page(p, flags, 0)) {
- if (TestClearPageHWPoison(p))
+ if (TestClearPageHWPoison(p)) {
+ int migratetype = get_pfnblock_migratetype(p, pfn);
+
num_poisoned_pages_dec();
+ __mod_zone_freepage_state(page_zone(p), 1, migratetype);
+ }
unpoison_pr_info("Unpoison: Software-unpoisoned free page %#lx\n",
pfn, &unpoison_rs);
return 0;

Thanks,
Naoya Horiguchi

2021-05-06 04:04:17

by Ding Hui

[permalink] [raw]
Subject: Re: [RFC PATCH] mm/page_alloc: fix counting of free pages after take off from buddy

On 2021/5/6 10:49, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Wed, Apr 28, 2021 at 04:54:59PM +0200, David Hildenbrand wrote:
>> On 21.04.21 04:04, Ding Hui wrote:
>>> Recently we found there is a lot MemFree left in /proc/meminfo after
>>> do a lot of pages soft offline.
>>>
>>> I think it's incorrect since NR_FREE_PAGES should not contain HWPoison pages.
>>> After take_page_off_buddy, the page is no longer belong to buddy
>>> allocator, and will not be used any more, but we maybe missed accounting
>>> NR_FREE_PAGES in this situation.
>>>
>>> Signed-off-by: Ding Hui <[email protected]>
>>> ---
>>> mm/page_alloc.c | 1 +
>>> 1 file changed, 1 insertion(+)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index cfc72873961d..8d65b62784d8 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -8947,6 +8947,7 @@ bool take_page_off_buddy(struct page *page)
>>> del_page_from_free_list(page_head, zone, page_order);
>>> break_down_buddy_pages(zone, page_head, page, 0,
>>> page_order, migratetype);
>>> + __mod_zone_page_state(zone, NR_FREE_PAGES, -1);
>>> ret = true;
>>> break;
>>> }
>>>
>>
>> Should this use __mod_zone_freepage_state() instead?
>
> Yes, __mod_zone_freepage_state() looks better to me.
>
> And I think that maybe an additional __mod_zone_freepage_state() in
> unpoison_memory() is necessary to cancel the decrement. I thought of the
> following, but it doesn't build because get_pfnblock_migratetype() is
> available only in mm/page_alloc.c, so you might want to add a small exported
> routine in mm/page_alloc.c and let it called from unpoison_memory().
>
> @@ -1899,8 +1899,12 @@ int unpoison_memory(unsigned long pfn)
> }
>
> if (!get_hwpoison_page(p, flags, 0)) {
> - if (TestClearPageHWPoison(p))
> + if (TestClearPageHWPoison(p)) {
> + int migratetype = get_pfnblock_migratetype(p, pfn);
> +
> num_poisoned_pages_dec();
> + __mod_zone_freepage_state(page_zone(p), 1, migratetype);
> + }
> unpoison_pr_info("Unpoison: Software-unpoisoned free page %#lx\n",
> pfn, &unpoison_rs);
> return 0;
>

I think there is another problem:
In normal case, we keep the last refcount of the hwpoison page, so
get_hwpoison_page should return 1. The NR_FREE_PAGES will be adjusted
when call put_page.
At race condition, we maybe leak the page because we does not put it
back to buddy in unpoison_memory, however the HWPoison flag,
num_poisoned_pages, NR_FREE_PAGES is adjusted correctly.

CPU0 CPU1

soft_offline_page
soft_offline_free_page
page_handle_poison
take_page_off_buddy
SetPageHWPoison
unpoison_memory
if (!get_hwpoison_page(p))
TestClearPageHWPoison
num_poisoned_pages_dec
__mod_zone_freepage_state
return 0
/* miss put the page back to buddy */
page_ref_inc
num_poisoned_pages_inc

How about do nothing and return -EBUSY (so the caller can retry) if
unpoison a zero refcount page , or return 0 like 230ac719c500
("mm/hwpoison: don't try to unpoison containment-failed pages") does ?

@@ -1736,11 +1736,9 @@ int unpoison_memory(unsigned long pfn)
}

if (!get_hwpoison_page(p, flags, 0)) {
- if (TestClearPageHWPoison(p))
- num_poisoned_pages_dec();
- unpoison_pr_info("Unpoison: Software-unpoisoned free page
%#lx\n",
+ unpoison_pr_info("Unpoison: Software-unpoisoned zero refcount
page %#lx\n",
pfn, &unpoison_rs);
- return 0;
+ return -EBUSY;
}

lock_page(page);



--
Thanks,
- Ding Hui

Subject: Re: [RFC PATCH] mm/page_alloc: fix counting of free pages after take off from buddy

On Thu, May 06, 2021 at 12:01:34PM +0800, Ding Hui wrote:
> On 2021/5/6 10:49, HORIGUCHI NAOYA(堀口 直也) wrote:
> > On Wed, Apr 28, 2021 at 04:54:59PM +0200, David Hildenbrand wrote:
> > > On 21.04.21 04:04, Ding Hui wrote:
> > > > Recently we found there is a lot MemFree left in /proc/meminfo after
> > > > do a lot of pages soft offline.
> > > >
> > > > I think it's incorrect since NR_FREE_PAGES should not contain HWPoison pages.
> > > > After take_page_off_buddy, the page is no longer belong to buddy
> > > > allocator, and will not be used any more, but we maybe missed accounting
> > > > NR_FREE_PAGES in this situation.
> > > >
> > > > Signed-off-by: Ding Hui <[email protected]>
> > > > ---
> > > > mm/page_alloc.c | 1 +
> > > > 1 file changed, 1 insertion(+)
> > > >
> > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > index cfc72873961d..8d65b62784d8 100644
> > > > --- a/mm/page_alloc.c
> > > > +++ b/mm/page_alloc.c
> > > > @@ -8947,6 +8947,7 @@ bool take_page_off_buddy(struct page *page)
> > > > del_page_from_free_list(page_head, zone, page_order);
> > > > break_down_buddy_pages(zone, page_head, page, 0,
> > > > page_order, migratetype);
> > > > + __mod_zone_page_state(zone, NR_FREE_PAGES, -1);
> > > > ret = true;
> > > > break;
> > > > }
> > > >
> > >
> > > Should this use __mod_zone_freepage_state() instead?
> >
> > Yes, __mod_zone_freepage_state() looks better to me.
> >
> > And I think that maybe an additional __mod_zone_freepage_state() in
> > unpoison_memory() is necessary to cancel the decrement. I thought of the
> > following, but it doesn't build because get_pfnblock_migratetype() is
> > available only in mm/page_alloc.c, so you might want to add a small exported
> > routine in mm/page_alloc.c and let it called from unpoison_memory().
> >
> > @@ -1899,8 +1899,12 @@ int unpoison_memory(unsigned long pfn)
> > }
> > if (!get_hwpoison_page(p, flags, 0)) {
> > - if (TestClearPageHWPoison(p))
> > + if (TestClearPageHWPoison(p)) {
> > + int migratetype = get_pfnblock_migratetype(p, pfn);
> > +
> > num_poisoned_pages_dec();
> > + __mod_zone_freepage_state(page_zone(p), 1, migratetype);
> > + }
> > unpoison_pr_info("Unpoison: Software-unpoisoned free page %#lx\n",
> > pfn, &unpoison_rs);
> > return 0;
> >
>
> I think there is another problem:
> In normal case, we keep the last refcount of the hwpoison page, so
> get_hwpoison_page should return 1. The NR_FREE_PAGES will be adjusted when
> call put_page.

I think that take_page_off_buddy() should not be called for this case
(the error page have remaining refcount). So it seems that no need to
update NR_FREE_PAGES ?

> At race condition, we maybe leak the page because we does not put it back to
> buddy in unpoison_memory, however the HWPoison flag, num_poisoned_pages,
> NR_FREE_PAGES is adjusted correctly.
>
> CPU0 CPU1
>
> soft_offline_page
> soft_offline_free_page
> page_handle_poison
> take_page_off_buddy
> SetPageHWPoison
> unpoison_memory
> if (!get_hwpoison_page(p))
> TestClearPageHWPoison
> num_poisoned_pages_dec
> __mod_zone_freepage_state
> return 0
> /* miss put the page back to buddy */
> page_ref_inc
> num_poisoned_pages_inc

Thanks for checking this, unpoison_memory() is racy. Recently we are suggesting
to introduce mf_mutex by [1]. Although this patch is not merged to mainline yet,
but it could be used to prevent the above race too.

[1] https://lore.kernel.org/linux-mm/[email protected]/

>
> How about do nothing and return -EBUSY (so the caller can retry) if unpoison
> a zero refcount page , or return 0 like 230ac719c500 ("mm/hwpoison: don't
> try to unpoison containment-failed pages") does ?
>
> @@ -1736,11 +1736,9 @@ int unpoison_memory(unsigned long pfn)
> }
>
> if (!get_hwpoison_page(p, flags, 0)) {
> - if (TestClearPageHWPoison(p))
> - num_poisoned_pages_dec();
> - unpoison_pr_info("Unpoison: Software-unpoisoned free page %#lx\n",
> + unpoison_pr_info("Unpoison: Software-unpoisoned zero refcount page
> %#lx\n",
> pfn, &unpoison_rs);
> - return 0;
> + return -EBUSY;

Currently unpoison_memory() does not work as reverse operation of take_page_off_buddy()
(it's simply broken), so implementing it at one time would be better.
I'll take time to fix unpoison_memory().

Thanks,
Naoya Horiguchi

2021-05-07 02:14:54

by Ding Hui

[permalink] [raw]
Subject: Re: [RFC PATCH] mm/page_alloc: fix counting of free pages after take off from buddy

On 2021/5/6 15:30, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Thu, May 06, 2021 at 12:01:34PM +0800, Ding Hui wrote:
>> On 2021/5/6 10:49, HORIGUCHI NAOYA(堀口 直也) wrote:
>>> On Wed, Apr 28, 2021 at 04:54:59PM +0200, David Hildenbrand wrote:
>>>> On 21.04.21 04:04, Ding Hui wrote:
>>>>> Recently we found there is a lot MemFree left in /proc/meminfo after
>>>>> do a lot of pages soft offline.
>>>>>
>>>>> I think it's incorrect since NR_FREE_PAGES should not contain HWPoison pages.
>>>>> After take_page_off_buddy, the page is no longer belong to buddy
>>>>> allocator, and will not be used any more, but we maybe missed accounting
>>>>> NR_FREE_PAGES in this situation.
>>>>>
>>>>> Signed-off-by: Ding Hui <[email protected]>
>>>>> ---
>>>>> mm/page_alloc.c | 1 +
>>>>> 1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>> index cfc72873961d..8d65b62784d8 100644
>>>>> --- a/mm/page_alloc.c
>>>>> +++ b/mm/page_alloc.c
>>>>> @@ -8947,6 +8947,7 @@ bool take_page_off_buddy(struct page *page)
>>>>> del_page_from_free_list(page_head, zone, page_order);
>>>>> break_down_buddy_pages(zone, page_head, page, 0,
>>>>> page_order, migratetype);
>>>>> + __mod_zone_page_state(zone, NR_FREE_PAGES, -1);
>>>>> ret = true;
>>>>> break;
>>>>> }
>>>>>
>>>>
>>>> Should this use __mod_zone_freepage_state() instead?
>>>
>>> Yes, __mod_zone_freepage_state() looks better to me.
>>>
>>> And I think that maybe an additional __mod_zone_freepage_state() in
>>> unpoison_memory() is necessary to cancel the decrement. I thought of the
>>> following, but it doesn't build because get_pfnblock_migratetype() is
>>> available only in mm/page_alloc.c, so you might want to add a small exported
>>> routine in mm/page_alloc.c and let it called from unpoison_memory().
>>>
>>> @@ -1899,8 +1899,12 @@ int unpoison_memory(unsigned long pfn)
>>> }
>>> if (!get_hwpoison_page(p, flags, 0)) {
>>> - if (TestClearPageHWPoison(p))
>>> + if (TestClearPageHWPoison(p)) {
>>> + int migratetype = get_pfnblock_migratetype(p, pfn);
>>> +
>>> num_poisoned_pages_dec();
>>> + __mod_zone_freepage_state(page_zone(p), 1, migratetype);
>>> + }
>>> unpoison_pr_info("Unpoison: Software-unpoisoned free page %#lx\n",
>>> pfn, &unpoison_rs);
>>> return 0;
>>>
>>
>> I think there is another problem:
>> In normal case, we keep the last refcount of the hwpoison page, so
>> get_hwpoison_page should return 1. The NR_FREE_PAGES will be adjusted when
>> call put_page.
>
> I think that take_page_off_buddy() should not be called for this case
> (the error page have remaining refcount). So it seems that no need to
> update NR_FREE_PAGES ?
>

Yes, take_page_off_buddy() only used for free pages, but we will call
page_ref_inc() after that, on the other hand for in used pages, we
increased the refcount by get_any_page(), so in both cases, the
hwpoisoned pages have refcount great than zero.

I think there is no need to update NR_FREE_PAGES explicitly in
unpoison_memory(), the put_page() will help us to update NR_FREE_PAGES
and put the page back to buddy.

>> At race condition, we maybe leak the page because we does not put it back to
>> buddy in unpoison_memory, however the HWPoison flag, num_poisoned_pages,
>> NR_FREE_PAGES is adjusted correctly.
>>
>> CPU0 CPU1
>>
>> soft_offline_page
>> soft_offline_free_page
>> page_handle_poison
>> take_page_off_buddy
>> SetPageHWPoison
>> unpoison_memory
>> if (!get_hwpoison_page(p))
>> TestClearPageHWPoison
>> num_poisoned_pages_dec
>> __mod_zone_freepage_state
>> return 0
>> /* miss put the page back to buddy */
>> page_ref_inc
>> num_poisoned_pages_inc
>
> Thanks for checking this, unpoison_memory() is racy. Recently we are suggesting
> to introduce mf_mutex by [1]. Although this patch is not merged to mainline yet,
> but it could be used to prevent the above race too.
>
> [1] https://lore.kernel.org/linux-mm/[email protected]/
>

I'll look forward to it, thanks.

>>
>> How about do nothing and return -EBUSY (so the caller can retry) if unpoison
>> a zero refcount page , or return 0 like 230ac719c500 ("mm/hwpoison: don't
>> try to unpoison containment-failed pages") does ?
>>
>> @@ -1736,11 +1736,9 @@ int unpoison_memory(unsigned long pfn)
>> }
>>
>> if (!get_hwpoison_page(p, flags, 0)) {
>> - if (TestClearPageHWPoison(p))
>> - num_poisoned_pages_dec();
>> - unpoison_pr_info("Unpoison: Software-unpoisoned free page %#lx\n",
>> + unpoison_pr_info("Unpoison: Software-unpoisoned zero refcount page
>> %#lx\n",
>> pfn, &unpoison_rs);
>> - return 0;
>> + return -EBUSY;
>
> Currently unpoison_memory() does not work as reverse operation of take_page_off_buddy()
> (it's simply broken), so implementing it at one time would be better.
> I'll take time to fix unpoison_memory().
>

Thanks for your work.
Actually, I'm not sure about the exactly meaning of "broken", it seems
that the basic function of unpoison_memory() is ok if not considered the
race conditions.


--
Thanks,
- Ding Hui

2021-05-08 03:58:07

by Ding Hui

[permalink] [raw]
Subject: [PATCH v2] mm/page_alloc: fix counting of free pages after take off from buddy

Recently we found there is a lot MemFree left in /proc/meminfo after
do a lot of pages soft offline.

I think it's incorrect since NR_FREE_PAGES should not contain HWPoison pages.
For offline free pages, after a successful call take_page_off_buddy(), the
page is no longer belong to buddy allocator, and will not be used any more,
but we missed accounting NR_FREE_PAGES in this situation.

Do update like rmqueue() does.

Signed-off-by: Ding Hui <[email protected]>
---
V2:
use __mod_zone_freepage_state instead of __mod_zone_page_state

mm/page_alloc.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cfc72873961d..e124a615303b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8947,6 +8947,7 @@ bool take_page_off_buddy(struct page *page)
del_page_from_free_list(page_head, zone, page_order);
break_down_buddy_pages(zone, page_head, page, 0,
page_order, migratetype);
+ __mod_zone_freepage_state(zone, -1, migratetype);
ret = true;
break;
}
--
2.17.1

Subject: Re: [PATCH v2] mm/page_alloc: fix counting of free pages after take off from buddy

On Sat, May 08, 2021 at 11:55:33AM +0800, Ding Hui wrote:
> Recently we found there is a lot MemFree left in /proc/meminfo after
> do a lot of pages soft offline.
>
> I think it's incorrect since NR_FREE_PAGES should not contain HWPoison pages.
> For offline free pages, after a successful call take_page_off_buddy(), the
> page is no longer belong to buddy allocator, and will not be used any more,
> but we missed accounting NR_FREE_PAGES in this situation.
>
> Do update like rmqueue() does.
>
> Signed-off-by: Ding Hui <[email protected]>
> ---
> V2:
> use __mod_zone_freepage_state instead of __mod_zone_page_state
>
> mm/page_alloc.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cfc72873961d..e124a615303b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8947,6 +8947,7 @@ bool take_page_off_buddy(struct page *page)
> del_page_from_free_list(page_head, zone, page_order);
> break_down_buddy_pages(zone, page_head, page, 0,
> page_order, migratetype);
> + __mod_zone_freepage_state(zone, -1, migratetype);

Page offline code (see set_migratetype_isolate()) seems to handle
NR_FREE_PAGES counter in its own way, so I think that it's more correct to
call __mod_zone_freepage_state() only when is_migrate_isolate(migratetype))
is false.

Otherwise, the patch looks good to me.

Thanks,
Naoya Horiguchi

> ret = true;
> break;
> }
> --
> 2.17.1
>

2021-05-26 02:46:11

by Ding Hui

[permalink] [raw]
Subject: Re: [PATCH v2] mm/page_alloc: fix counting of free pages after take off from buddy

On 2021/5/25 16:32, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Sat, May 08, 2021 at 11:55:33AM +0800, Ding Hui wrote:
>> Recently we found there is a lot MemFree left in /proc/meminfo after
>> do a lot of pages soft offline.
>>
>> I think it's incorrect since NR_FREE_PAGES should not contain HWPoison pages.
>> For offline free pages, after a successful call take_page_off_buddy(), the
>> page is no longer belong to buddy allocator, and will not be used any more,
>> but we missed accounting NR_FREE_PAGES in this situation.
>>
>> Do update like rmqueue() does.
>>
>> Signed-off-by: Ding Hui <[email protected]>
>> ---
>> V2:
>> use __mod_zone_freepage_state instead of __mod_zone_page_state
>>
>> mm/page_alloc.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index cfc72873961d..e124a615303b 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -8947,6 +8947,7 @@ bool take_page_off_buddy(struct page *page)
>> del_page_from_free_list(page_head, zone, page_order);
>> break_down_buddy_pages(zone, page_head, page, 0,
>> page_order, migratetype);
>> + __mod_zone_freepage_state(zone, -1, migratetype);
>
> Page offline code (see set_migratetype_isolate()) seems to handle
> NR_FREE_PAGES counter in its own way, so I think that it's more correct to
> call __mod_zone_freepage_state() only when is_migrate_isolate(migratetype))
> is false.
>
> Otherwise, the patch looks good to me.
>

Thanks for reply and suggestion, I'll send v3 patch later.

>> ret = true;
>> break;
>> }
>> --
>> 2.17.1


--
Thanks,
- Ding Hui