2024-02-05 03:54:41

by Baolin Wang

Subject: [PATCH] mm: hugetlb: fix hugetlb allocation failure when handling freed or in-use hugetlb

When handling a freed or in-use hugetlb, we should ignore a failure of
alloc_buddy_hugetlb_folio() so that the old hugetlb can still be dissolved
successfully, since the newly allocated hugetlb is not used in these two
cases.

Signed-off-by: Baolin Wang <[email protected]>
---
mm/hugetlb.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9d996fe4ecd9..212ab331d355 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3042,9 +3042,8 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 	 * under the lock.
 	 */
 	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
-	if (!new_folio)
-		return -ENOMEM;
-	__prep_new_hugetlb_folio(h, new_folio);
+	if (new_folio)
+		__prep_new_hugetlb_folio(h, new_folio);

 retry:
 	spin_lock_irq(&hugetlb_lock);
@@ -3075,6 +3074,11 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 		cond_resched();
 		goto retry;
 	} else {
+		if (!new_folio) {
+			ret = -ENOMEM;
+			goto free_new;
+		}
+
 		/*
 		 * Ok, old_folio is still a genuine free hugepage. Remove it from
 		 * the freelist and decrease the counters. These will be
@@ -3102,9 +3106,11 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,

 free_new:
 	spin_unlock_irq(&hugetlb_lock);
-	/* Folio has a zero ref count, but needs a ref to be freed */
-	folio_ref_unfreeze(new_folio, 1);
-	update_and_free_hugetlb_folio(h, new_folio, false);
+	if (new_folio) {
+		/* Folio has a zero ref count, but needs a ref to be freed */
+		folio_ref_unfreeze(new_folio, 1);
+		update_and_free_hugetlb_folio(h, new_folio, false);
+	}

 	return ret;
 }
--
2.39.3
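
For readers following the thread, the control flow that results from this
patch can be modelled in userspace roughly as below. This is only a sketch
under stated assumptions: enum old_state, model_alloc_and_dissolve() and its
alloc_ok/isolated parameters are hypothetical stand-ins, not kernel code, and
the locking and the retry path are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the three states old_folio can be found in. */
enum old_state {
	OLD_FREED,	/* no longer a hugetlb folio: freed from under us */
	OLD_IN_USE,	/* refcount is non-zero: someone grabbed it */
	OLD_IN_POOL,	/* genuine free hugepage sitting on the freelist */
};

/*
 * alloc_ok simulates whether the up-front allocation succeeded and
 * isolated simulates the outcome of isolating an in-use folio.
 */
static int model_alloc_and_dissolve(enum old_state state, bool alloc_ok,
				    bool isolated)
{
	static char page;			/* stands in for a real folio */
	void *new_folio = alloc_ok ? &page : NULL;

	switch (state) {
	case OLD_FREED:
		return 0;			/* new_folio is not needed */
	case OLD_IN_USE:
		return isolated ? 0 : -EBUSY;	/* new_folio is not needed */
	case OLD_IN_POOL:
		if (!new_folio)
			return -ENOMEM;		/* the only case that must fail */
		assert(new_folio == &page);	/* replacement goes into the pool */
		return 0;
	}
	return -EINVAL;				/* unreachable for valid states */
}
```

In this model, model_alloc_and_dissolve(OLD_IN_USE, false, true) returns 0
even though the allocation failed, whereas the code before the patch returned
-ENOMEM without looking at old_folio at all.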



2024-02-05 06:57:44

by Muchun Song

Subject: Re: [PATCH] mm: hugetlb: fix hugetlb allocation failure when handling freed or in-use hugetlb



> On Feb 5, 2024, at 11:54, Baolin Wang <[email protected]> wrote:
>
> When handling a freed or in-use hugetlb, we should ignore a failure of
> alloc_buddy_hugetlb_folio() so that the old hugetlb can still be dissolved
> successfully, since the newly allocated hugetlb is not used in these two
> cases.
>
> Signed-off-by: Baolin Wang <[email protected]>

OK. It is not a fix (I see a fix keyword in subject) but an
optimization for unnecessary-allocation cases. Thanks.

Reviewed-by: Muchun Song <[email protected]>


2024-02-05 08:29:07

by Baolin Wang

Subject: Re: [PATCH] mm: hugetlb: fix hugetlb allocation failure when handling freed or in-use hugetlb



On 2/5/2024 2:56 PM, Muchun Song wrote:
>
>
>> On Feb 5, 2024, at 11:54, Baolin Wang <[email protected]> wrote:
>>
>> When handling a freed or in-use hugetlb, we should ignore a failure of
>> alloc_buddy_hugetlb_folio() so that the old hugetlb can still be dissolved
>> successfully, since the newly allocated hugetlb is not used in these two
>> cases.
>>
>> Signed-off-by: Baolin Wang <[email protected]>
>
> OK. It is not a fix (I see a fix keyword in subject) but an
> optimization for unnecessary-allocation cases. Thanks.

Yes, it would be better to change the subject to 'mm: hugetlb: improve the
handling of hugetlb allocation failure for freed or in-use hugetlb'.

Andrew, could you help to change the subject line when you apply it? (Or
if you want a new version, please let me know.) Thanks.

> Reviewed-by: Muchun Song <[email protected]>

Thanks for reviewing.

2024-02-05 09:31:40

by Michal Hocko

Subject: Re: [PATCH] mm: hugetlb: fix hugetlb allocation failure when handling freed or in-use hugetlb

On Mon 05-02-24 11:54:17, Baolin Wang wrote:
> When handling a freed or in-use hugetlb, we should ignore a failure of
> alloc_buddy_hugetlb_folio() so that the old hugetlb can still be dissolved
> successfully, since the newly allocated hugetlb is not used in these two
> cases.
>
> Signed-off-by: Baolin Wang <[email protected]>
> ---
> mm/hugetlb.c | 18 ++++++++++++------
> 1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 9d996fe4ecd9..212ab331d355 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3042,9 +3042,8 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>  	 * under the lock.
>  	 */
>  	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
> -	if (!new_folio)
> -		return -ENOMEM;
> -	__prep_new_hugetlb_folio(h, new_folio);
> +	if (new_folio)
> +		__prep_new_hugetlb_folio(h, new_folio);

Is there any reason why you haven't moved the allocation to the only
branch that actually needs it? I know that we hold the hugetlb lock there,
but you could have easily dropped the lock, allocated a page and then done
a goto retry. This would actually save an allocation in the other cases.

Something like this:

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ed1581b670d4..db5f72b94422 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3029,21 +3029,9 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 {
 	gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
 	int nid = folio_nid(old_folio);
-	struct folio *new_folio;
+	struct folio *new_folio = NULL;
 	int ret = 0;

-	/*
-	 * Before dissolving the folio, we need to allocate a new one for the
-	 * pool to remain stable. Here, we allocate the folio and 'prep' it
-	 * by doing everything but actually updating counters and adding to
-	 * the pool. This simplifies and let us do most of the processing
-	 * under the lock.
-	 */
-	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
-	if (!new_folio)
-		return -ENOMEM;
-	__prep_new_hugetlb_folio(h, new_folio);
-
 retry:
 	spin_lock_irq(&hugetlb_lock);
 	if (!folio_test_hugetlb(old_folio)) {
@@ -3073,6 +3061,15 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 		cond_resched();
 		goto retry;
 	} else {
+
+		if (!new_folio) {
+			spin_unlock_irq(&hugetlb_lock);
+			new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
+			if (!new_folio)
+				return -ENOMEM;
+			__prep_new_hugetlb_folio(h, new_folio);
+			goto retry;
+		}
 		/*
 		 * Ok, old_folio is still a genuine free hugepage. Remove it from
 		 * the freelist and decrease the counters. These will be
@@ -3100,9 +3097,11 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,

 free_new:
 	spin_unlock_irq(&hugetlb_lock);
-	/* Folio has a zero ref count, but needs a ref to be freed */
-	folio_ref_unfreeze(new_folio, 1);
-	update_and_free_hugetlb_folio(h, new_folio, false);
+	if (new_folio) {
+		/* Folio has a zero ref count, but needs a ref to be freed */
+		folio_ref_unfreeze(new_folio, 1);
+		update_and_free_hugetlb_folio(h, new_folio, false);
+	}

 	return ret;
 }
--
Michal Hocko
SUSE Labs
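
The "drop the lock, allocate, then retry" pattern suggested above can be
illustrated outside the kernel. The following is a minimal userspace sketch,
assuming a pthread mutex in place of hugetlb_lock and malloc() in place of
alloc_buddy_hugetlb_folio(); struct pool, replace_free_page() and nr_allocs
are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

struct pool {
	void *free_page;	/* page to be replaced, or NULL if it is gone */
	int nr_allocs;		/* replacement allocations attempted so far */
};

static int replace_free_page(struct pool *p)
{
	void *new_page = NULL;
	void *old_page;

retry:
	pthread_mutex_lock(&pool_lock);
	if (!p->free_page) {
		/* Nothing to replace, so no allocation is required. */
		pthread_mutex_unlock(&pool_lock);
		free(new_page);		/* drop an unused replacement, if any */
		return 0;
	}
	if (!new_page) {
		/* Only this branch needs a page: allocate it unlocked. */
		p->nr_allocs++;
		pthread_mutex_unlock(&pool_lock);
		new_page = malloc(4096);
		if (!new_page)
			return -ENOMEM;
		goto retry;		/* state may have changed meanwhile */
	}
	/* Still replaceable and we hold a new page: swap under the lock. */
	old_page = p->free_page;
	p->free_page = new_page;
	pthread_mutex_unlock(&pool_lock);
	free(old_page);
	return 0;
}
```

The point of the retry is that every check made before the lock was dropped
has to be repeated, because the state may have changed while the allocation
was in progress. When the page has already gone away, the function returns
without ever calling the allocator.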

2024-02-05 13:03:39

by Baolin Wang

Subject: Re: [PATCH] mm: hugetlb: fix hugetlb allocation failure when handling freed or in-use hugetlb



On 2/5/2024 5:31 PM, Michal Hocko wrote:
> On Mon 05-02-24 11:54:17, Baolin Wang wrote:
>> When handling a freed or in-use hugetlb, we should ignore a failure of
>> alloc_buddy_hugetlb_folio() so that the old hugetlb can still be dissolved
>> successfully, since the newly allocated hugetlb is not used in these two
>> cases.
>>
>> Signed-off-by: Baolin Wang <[email protected]>
>> ---
>> mm/hugetlb.c | 18 ++++++++++++------
>> 1 file changed, 12 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 9d996fe4ecd9..212ab331d355 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -3042,9 +3042,8 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>>  	 * under the lock.
>>  	 */
>>  	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
>> -	if (!new_folio)
>> -		return -ENOMEM;
>> -	__prep_new_hugetlb_folio(h, new_folio);
>> +	if (new_folio)
>> +		__prep_new_hugetlb_folio(h, new_folio);
>
> Is there any reason why you haven't moved the allocation to the only
> branch that actually needs it? I know that we hold hugetlb lock but you

Nope, just did a simple patch to ignore the allocation failure.

> could have easily dropped the lock, allocated a page and then goto retry.
> This would actually save an allocation.

Yes, will do. Thanks.

> Something like this:
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index ed1581b670d4..db5f72b94422 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3029,21 +3029,9 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>  {
>  	gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
>  	int nid = folio_nid(old_folio);
> -	struct folio *new_folio;
> +	struct folio *new_folio = NULL;
>  	int ret = 0;
>
> -	/*
> -	 * Before dissolving the folio, we need to allocate a new one for the
> -	 * pool to remain stable. Here, we allocate the folio and 'prep' it
> -	 * by doing everything but actually updating counters and adding to
> -	 * the pool. This simplifies and let us do most of the processing
> -	 * under the lock.
> -	 */
> -	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
> -	if (!new_folio)
> -		return -ENOMEM;
> -	__prep_new_hugetlb_folio(h, new_folio);
> -
>  retry:
>  	spin_lock_irq(&hugetlb_lock);
>  	if (!folio_test_hugetlb(old_folio)) {
> @@ -3073,6 +3061,15 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>  		cond_resched();
>  		goto retry;
>  	} else {
> +
> +		if (!new_folio) {
> +			spin_unlock_irq(&hugetlb_lock);
> +			new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
> +			if (!new_folio)
> +				return -ENOMEM;
> +			__prep_new_hugetlb_folio(h, new_folio);
> +			goto retry;
> +		}
>  		/*
>  		 * Ok, old_folio is still a genuine free hugepage. Remove it from
>  		 * the freelist and decrease the counters. These will be
> @@ -3100,9 +3097,11 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>
>  free_new:
>  	spin_unlock_irq(&hugetlb_lock);
> -	/* Folio has a zero ref count, but needs a ref to be freed */
> -	folio_ref_unfreeze(new_folio, 1);
> -	update_and_free_hugetlb_folio(h, new_folio, false);
> +	if (new_folio) {
> +		/* Folio has a zero ref count, but needs a ref to be freed */
> +		folio_ref_unfreeze(new_folio, 1);
> +		update_and_free_hugetlb_folio(h, new_folio, false);
> +	}
>
>  	return ret;
>  }