Following are some cleanup for hugetlb.
Simple test with tools/testing/selftests/vm/map_hugetlb pass.
Wei Yang (10):
mm/hugetlb: not necessary to coalesce regions recursively
mm/hugetlb: make sure to get NULL when list is empty
mm/hugetlb: use list_splice to merge two list at once
mm/hugetlb: count file_region to be added when regions_needed != NULL
mm/hugetlb: remove the redundant check on non_swap_entry()
mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault()
mm/hugetlb: a page from buddy is not on any list
mm/hugetlb: return non-isolated page in the loop instead of break and
check
mm/hugetlb: narrow the hugetlb_lock protection area during preparing
huge page
mm/hugetlb: not necessary to abuse temporary page to workaround the
nasty free_huge_page
mm/hugetlb.c | 101 ++++++++++++++++++++++-----------------------------
1 file changed, 44 insertions(+), 57 deletions(-)
--
2.20.1 (Apple Git-117)
list_first_entry() may not return NULL even when the list is empty.
Let's make sure the behavior by using list_first_entry_or_null(),
otherwise it would corrupt the list.
Signed-off-by: Wei Yang <[email protected]>
---
mm/hugetlb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 62ec74f6d03f..0a2f3851b828 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
VM_BUG_ON(resv->region_cache_count <= 0);
resv->region_cache_count--;
- nrg = list_first_entry(&resv->region_cache, struct file_region, link);
+ nrg = list_first_entry_or_null(&resv->region_cache,
+ struct file_region, link);
VM_BUG_ON(!nrg);
list_del(&nrg->link);
--
2.20.1 (Apple Git-117)
Before proper processing, huge_pte_alloc() would be called
un-conditionally. It is not necessary to do this when ptep is NULL.
Signed-off-by: Wei Yang <[email protected]>
---
mm/hugetlb.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f5f04e89000d..fb09e5a83c39 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4534,10 +4534,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
return VM_FAULT_HWPOISON_LARGE |
VM_FAULT_SET_HINDEX(hstate_index(h));
- } else {
- ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
- if (!ptep)
- return VM_FAULT_OOM;
}
/*
--
2.20.1 (Apple Git-117)
Migration and hwpoison entry is a subset of non_swap_entry().
Remove the redundant check on non_swap_entry().
Signed-off-by: Wei Yang <[email protected]>
---
mm/hugetlb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d775e514eb2e..f5f04e89000d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3778,7 +3778,7 @@ bool is_hugetlb_entry_migration(pte_t pte)
if (huge_pte_none(pte) || pte_present(pte))
return false;
swp = pte_to_swp_entry(pte);
- if (non_swap_entry(swp) && is_migration_entry(swp))
+ if (is_migration_entry(swp))
return true;
else
return false;
@@ -3791,7 +3791,7 @@ static int is_hugetlb_entry_hwpoisoned(pte_t pte)
if (huge_pte_none(pte) || pte_present(pte))
return 0;
swp = pte_to_swp_entry(pte);
- if (non_swap_entry(swp) && is_hwpoison_entry(swp))
+ if (is_hwpoison_entry(swp))
return 1;
else
return 0;
--
2.20.1 (Apple Git-117)
Per my understanding, we keep the regions ordered and would always
coalesce regions properly. So the task to keep this property is just
to coalesce its neighbour.
Let's simplify this.
Signed-off-by: Wei Yang <[email protected]>
---
mm/hugetlb.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 590111ea6975..62ec74f6d03f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -307,8 +307,7 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
list_del(&rg->link);
kfree(rg);
- coalesce_file_region(resv, prg);
- return;
+ rg = prg;
}
nrg = list_next_entry(rg, link);
@@ -318,9 +317,6 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
list_del(&rg->link);
kfree(rg);
-
- coalesce_file_region(resv, nrg);
- return;
}
}
--
2.20.1 (Apple Git-117)
On 08/07/20 at 05:12pm, Wei Yang wrote:
> Per my understanding, we keep the regions ordered and would always
> coalesce regions properly. So the task to keep this property is just
> to coalesce its neighbour.
>
> Let's simplify this.
>
> Signed-off-by: Wei Yang <[email protected]>
> ---
> mm/hugetlb.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 590111ea6975..62ec74f6d03f 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -307,8 +307,7 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
> list_del(&rg->link);
> kfree(rg);
>
> - coalesce_file_region(resv, prg);
> - return;
> + rg = prg;
> }
>
> nrg = list_next_entry(rg, link);
> @@ -318,9 +317,6 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
>
> list_del(&rg->link);
> kfree(rg);
> -
> - coalesce_file_region(resv, nrg);
I agree with the change. But this change the original behaviour of
coalesce_file_region, not sure if there's any reason we need to do that,
maybe Mike can give a judgement. Personally,
Reviewed-by: Baoquan He <[email protected]>
> - return;
> }
> }
>
> --
> 2.20.1 (Apple Git-117)
>
>
On 08/07/20 at 05:12pm, Wei Yang wrote:
> list_first_entry() may not return NULL even when the list is empty.
>
> Let's make sure the behavior by using list_first_entry_or_null(),
> otherwise it would corrupt the list.
>
> Signed-off-by: Wei Yang <[email protected]>
> ---
> mm/hugetlb.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 62ec74f6d03f..0a2f3851b828 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
> VM_BUG_ON(resv->region_cache_count <= 0);
We have had above line, is it possible to be NULL from list_first_entry?
>
> resv->region_cache_count--;
> - nrg = list_first_entry(&resv->region_cache, struct file_region, link);
> + nrg = list_first_entry_or_null(&resv->region_cache,
> + struct file_region, link);
> VM_BUG_ON(!nrg);
> list_del(&nrg->link);
>
> --
> 2.20.1 (Apple Git-117)
>
>
On 08/07/20 at 05:12pm, Wei Yang wrote:
> Migration and hwpoison entry is a subset of non_swap_entry().
>
> Remove the redundant check on non_swap_entry().
>
> Signed-off-by: Wei Yang <[email protected]>
Hmm, I have posted one patch to do the same thing, got reivewed by
people.
https://lore.kernel.org/linux-mm/20200723104636.GS32539@MiWiFi-R3L-srv/
> ---
> mm/hugetlb.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index d775e514eb2e..f5f04e89000d 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3778,7 +3778,7 @@ bool is_hugetlb_entry_migration(pte_t pte)
> if (huge_pte_none(pte) || pte_present(pte))
> return false;
> swp = pte_to_swp_entry(pte);
> - if (non_swap_entry(swp) && is_migration_entry(swp))
> + if (is_migration_entry(swp))
> return true;
> else
> return false;
> @@ -3791,7 +3791,7 @@ static int is_hugetlb_entry_hwpoisoned(pte_t pte)
> if (huge_pte_none(pte) || pte_present(pte))
> return 0;
> swp = pte_to_swp_entry(pte);
> - if (non_swap_entry(swp) && is_hwpoison_entry(swp))
> + if (is_hwpoison_entry(swp))
> return 1;
> else
> return 0;
> --
> 2.20.1 (Apple Git-117)
>
>
On 08/07/20 at 05:12pm, Wei Yang wrote:
> Before proper processing, huge_pte_alloc() would be called
> un-conditionally. It is not necessary to do this when ptep is NULL.
>
> Signed-off-by: Wei Yang <[email protected]>
> ---
> mm/hugetlb.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f5f04e89000d..fb09e5a83c39 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4534,10 +4534,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
> return VM_FAULT_HWPOISON_LARGE |
> VM_FAULT_SET_HINDEX(hstate_index(h));
> - } else {
> - ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
> - if (!ptep)
> - return VM_FAULT_OOM;
Right, seems a relic from Mike's i_mmap_rwsem handling patches.
Reviewed-by: Baoquan He <[email protected]>
> }
>
> /*
> --
> 2.20.1 (Apple Git-117)
>
>
On Fri, Aug 07, 2020 at 08:49:51PM +0800, Baoquan He wrote:
>On 08/07/20 at 05:12pm, Wei Yang wrote:
>> list_first_entry() may not return NULL even when the list is empty.
>>
>> Let's make sure the behavior by using list_first_entry_or_null(),
>> otherwise it would corrupt the list.
>>
>> Signed-off-by: Wei Yang <[email protected]>
>> ---
>> mm/hugetlb.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 62ec74f6d03f..0a2f3851b828 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
>> VM_BUG_ON(resv->region_cache_count <= 0);
>
>
>We have had above line, is it possible to be NULL from list_first_entry?
>
>>
>> resv->region_cache_count--;
>> - nrg = list_first_entry(&resv->region_cache, struct file_region, link);
>> + nrg = list_first_entry_or_null(&resv->region_cache,
>> + struct file_region, link);
>> VM_BUG_ON(!nrg);
Or we can remove this VM_BUG_ON()?
>> list_del(&nrg->link);
>>
>> --
>> 2.20.1 (Apple Git-117)
>>
>>
--
Wei Yang
Help you, Help me
On Fri, Aug 07, 2020 at 08:55:50PM +0800, Baoquan He wrote:
>On 08/07/20 at 05:12pm, Wei Yang wrote:
>> Migration and hwpoison entry is a subset of non_swap_entry().
>>
>> Remove the redundant check on non_swap_entry().
>>
>> Signed-off-by: Wei Yang <[email protected]>
>
>Hmm, I have posted one patch to do the same thing, got reivewed by
>people.
>
>https://lore.kernel.org/linux-mm/20200723104636.GS32539@MiWiFi-R3L-srv/
>
Nice
>> ---
>> mm/hugetlb.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index d775e514eb2e..f5f04e89000d 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -3778,7 +3778,7 @@ bool is_hugetlb_entry_migration(pte_t pte)
>> if (huge_pte_none(pte) || pte_present(pte))
>> return false;
>> swp = pte_to_swp_entry(pte);
>> - if (non_swap_entry(swp) && is_migration_entry(swp))
>> + if (is_migration_entry(swp))
>> return true;
>> else
>> return false;
>> @@ -3791,7 +3791,7 @@ static int is_hugetlb_entry_hwpoisoned(pte_t pte)
>> if (huge_pte_none(pte) || pte_present(pte))
>> return 0;
>> swp = pte_to_swp_entry(pte);
>> - if (non_swap_entry(swp) && is_hwpoison_entry(swp))
>> + if (is_hwpoison_entry(swp))
>> return 1;
>> else
>> return 0;
>> --
>> 2.20.1 (Apple Git-117)
>>
>>
--
Wei Yang
Help you, Help me
On 8/7/20 2:12 AM, Wei Yang wrote:
> Following are some cleanup for hugetlb.
>
> Simple test with tools/testing/selftests/vm/map_hugetlb pass.
>
> Wei Yang (10):
> mm/hugetlb: not necessary to coalesce regions recursively
> mm/hugetlb: make sure to get NULL when list is empty
> mm/hugetlb: use list_splice to merge two list at once
> mm/hugetlb: count file_region to be added when regions_needed != NULL
> mm/hugetlb: remove the redundant check on non_swap_entry()
> mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault()
> mm/hugetlb: a page from buddy is not on any list
> mm/hugetlb: return non-isolated page in the loop instead of break and
> check
> mm/hugetlb: narrow the hugetlb_lock protection area during preparing
> huge page
> mm/hugetlb: not necessary to abuse temporary page to workaround the
> nasty free_huge_page
>
> mm/hugetlb.c | 101 ++++++++++++++++++++++-----------------------------
> 1 file changed, 44 insertions(+), 57 deletions(-)
Thanks Wei Yang!
I'll take a look at these next week.
--
Mike Kravetz
On 08/07/20 at 10:28pm, Wei Yang wrote:
> On Fri, Aug 07, 2020 at 08:49:51PM +0800, Baoquan He wrote:
> >On 08/07/20 at 05:12pm, Wei Yang wrote:
> >> list_first_entry() may not return NULL even when the list is empty.
> >>
> >> Let's make sure the behavior by using list_first_entry_or_null(),
> >> otherwise it would corrupt the list.
> >>
> >> Signed-off-by: Wei Yang <[email protected]>
> >> ---
> >> mm/hugetlb.c | 3 ++-
> >> 1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> >> index 62ec74f6d03f..0a2f3851b828 100644
> >> --- a/mm/hugetlb.c
> >> +++ b/mm/hugetlb.c
> >> @@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
> >> VM_BUG_ON(resv->region_cache_count <= 0);
> >
> >
> >We have had above line, is it possible to be NULL from list_first_entry?
> >
> >>
> >> resv->region_cache_count--;
> >> - nrg = list_first_entry(&resv->region_cache, struct file_region, link);
> >> + nrg = list_first_entry_or_null(&resv->region_cache,
> >> + struct file_region, link);
> >> VM_BUG_ON(!nrg);
>
> Or we can remove this VM_BUG_ON()?
Yeah, it's fine to me.
>
> >> list_del(&nrg->link);
> >>
> >> --
> >> 2.20.1 (Apple Git-117)
> >>
> >>
>
> --
> Wei Yang
> Help you, Help me
>
On 8/7/20 2:12 AM, Wei Yang wrote:
> Per my understanding, we keep the regions ordered and would always
> coalesce regions properly. So the task to keep this property is just
> to coalesce its neighbour.
>
> Let's simplify this.
>
> Signed-off-by: Wei Yang <[email protected]>
Thanks! It is unfortunate that the region management code is difficult
to understand.
Reviewed-by: Mike Kravetz <[email protected]>
--
Mike Kravetz
On 8/7/20 7:28 AM, Wei Yang wrote:
> On Fri, Aug 07, 2020 at 08:49:51PM +0800, Baoquan He wrote:
>> On 08/07/20 at 05:12pm, Wei Yang wrote:
>>> list_first_entry() may not return NULL even when the list is empty.
>>>
>>> Let's make sure the behavior by using list_first_entry_or_null(),
>>> otherwise it would corrupt the list.
>>>
>>> Signed-off-by: Wei Yang <[email protected]>
>>> ---
>>> mm/hugetlb.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 62ec74f6d03f..0a2f3851b828 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
>>> VM_BUG_ON(resv->region_cache_count <= 0);
>>
>>
>> We have had above line, is it possible to be NULL from list_first_entry?
>>
>>>
>>> resv->region_cache_count--;
>>> - nrg = list_first_entry(&resv->region_cache, struct file_region, link);
>>> + nrg = list_first_entry_or_null(&resv->region_cache,
>>> + struct file_region, link);
>>> VM_BUG_ON(!nrg);
>
> Or we can remove this VM_BUG_ON()?
>
I would prefer that we just remove the 'VM_BUG_ON(!nrg)'. Code elsewhere
is responsible for making sure there is ALWAYS an entry in the cache. That
is why the 'VM_BUG_ON(resv->region_cache_count <= 0)' is at the beginning
of the routine.
--
Mike Kravetz
On 8/7/20 2:12 AM, Wei Yang wrote:
> Before proper processing, huge_pte_alloc() would be called
> un-conditionally. It is not necessary to do this when ptep is NULL.
Worse, that extra call is a bug. I believe Andrew pulled this patch into
his queue. It still could use a review.
https://lore.kernel.org/linux-mm/[email protected]/
--
Mike Kravetz
On Mon, Aug 10, 2020 at 01:28:46PM -0700, Mike Kravetz wrote:
>On 8/7/20 7:28 AM, Wei Yang wrote:
>> On Fri, Aug 07, 2020 at 08:49:51PM +0800, Baoquan He wrote:
>>> On 08/07/20 at 05:12pm, Wei Yang wrote:
>>>> list_first_entry() may not return NULL even when the list is empty.
>>>>
>>>> Let's make sure the behavior by using list_first_entry_or_null(),
>>>> otherwise it would corrupt the list.
>>>>
>>>> Signed-off-by: Wei Yang <[email protected]>
>>>> ---
>>>> mm/hugetlb.c | 3 ++-
>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>>> index 62ec74f6d03f..0a2f3851b828 100644
>>>> --- a/mm/hugetlb.c
>>>> +++ b/mm/hugetlb.c
>>>> @@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
>>>> VM_BUG_ON(resv->region_cache_count <= 0);
>>>
>>>
>>> We have had above line, is it possible to be NULL from list_first_entry?
>>>
>>>>
>>>> resv->region_cache_count--;
>>>> - nrg = list_first_entry(&resv->region_cache, struct file_region, link);
>>>> + nrg = list_first_entry_or_null(&resv->region_cache,
>>>> + struct file_region, link);
>>>> VM_BUG_ON(!nrg);
>>
>> Or we can remove this VM_BUG_ON()?
>>
>
>I would prefer that we just remove the 'VM_BUG_ON(!nrg)'. Code elsewhere
>is responsible for making sure there is ALWAYS an entry in the cache. That
>is why the 'VM_BUG_ON(resv->region_cache_count <= 0)' is at the beginning
>of the routine.
Sure, will change to this.
>
>--
>Mike Kravetz
--
Wei Yang
Help you, Help me