2020-04-29 22:59:34

by Yang Shi

Subject: [linux-next PATCH 1/2] mm: khugepaged: add exceed_max_ptes_* helpers

The max_ptes_{swap|none|shared} are defined to tune the behavior of
khugepaged. They are checked in a couple of places with open coding.
Replace the open coding with exceed_max_ptes_{swap|none|shared} helpers
to improve readability.

Cc: Kirill A. Shutemov <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Signed-off-by: Yang Shi <[email protected]>
---
mm/khugepaged.c | 27 +++++++++++++++++++++------
1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a02a4c5..0c8d30b 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -339,6 +339,21 @@ struct attribute_group khugepaged_attr_group = {
};
#endif /* CONFIG_SYSFS */

+static inline bool exceed_max_ptes_none(unsigned int *nr_ptes)
+{
+ return (++(*nr_ptes) > khugepaged_max_ptes_none);
+}
+
+static inline bool exceed_max_ptes_swap(unsigned int *nr_ptes)
+{
+ return (++(*nr_ptes) > khugepaged_max_ptes_swap);
+}
+
+static inline bool exceed_max_ptes_shared(unsigned int *nr_ptes)
+{
+ return (++(*nr_ptes) > khugepaged_max_ptes_shared);
+}
+
int hugepage_madvise(struct vm_area_struct *vma,
unsigned long *vm_flags, int advice)
{
@@ -604,7 +619,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
if (pte_none(pteval) || (pte_present(pteval) &&
is_zero_pfn(pte_pfn(pteval)))) {
if (!userfaultfd_armed(vma) &&
- ++none_or_zero <= khugepaged_max_ptes_none) {
+ !exceed_max_ptes_none(&none_or_zero)) {
continue;
} else {
result = SCAN_EXCEED_NONE_PTE;
@@ -624,7 +639,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
VM_BUG_ON_PAGE(!PageAnon(page), page);

if (page_mapcount(page) > 1 &&
- ++shared > khugepaged_max_ptes_shared) {
+ exceed_max_ptes_shared(&shared)) {
result = SCAN_EXCEED_SHARED_PTE;
goto out;
}
@@ -1234,7 +1249,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
_pte++, _address += PAGE_SIZE) {
pte_t pteval = *_pte;
if (is_swap_pte(pteval)) {
- if (++unmapped <= khugepaged_max_ptes_swap) {
+ if (!exceed_max_ptes_swap(&unmapped)) {
/*
* Always be strict with uffd-wp
* enabled swap entries. Please see
@@ -1252,7 +1267,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
}
if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
if (!userfaultfd_armed(vma) &&
- ++none_or_zero <= khugepaged_max_ptes_none) {
+ !exceed_max_ptes_none(&none_or_zero)) {
continue;
} else {
result = SCAN_EXCEED_NONE_PTE;
@@ -1286,7 +1301,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
}

if (page_mapcount(page) > 1 &&
- ++shared > khugepaged_max_ptes_shared) {
+ exceed_max_ptes_shared(&shared)) {
result = SCAN_EXCEED_SHARED_PTE;
goto out_unmap;
}
@@ -1961,7 +1976,7 @@ static void khugepaged_scan_file(struct mm_struct *mm,
continue;

if (xa_is_value(page)) {
- if (++swap > khugepaged_max_ptes_swap) {
+ if (exceed_max_ptes_swap(&swap)) {
result = SCAN_EXCEED_SWAP_PTE;
break;
}
--
1.8.3.1


2020-04-29 22:59:55

by Yang Shi

Subject: [linux-next PATCH 2/2] mm: khugepaged: don't have to put being freed page back to lru

When khugepaged has successfully isolated a base page and copied its data
to the collapsed THP, the base page is about to be freed. Putting the page
back on the LRU is not productive: vmscan might isolate it, but vmscan
cannot reclaim it because try_to_unmap() cannot unmap it at all.

khugepaged is actually the last user of this page, so it can be freed
directly. Clear the active and unevictable flags, unlock the page, and
drop the refcount taken at isolation instead of calling
putback_lru_page().

Cc: Kirill A. Shutemov <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Signed-off-by: Yang Shi <[email protected]>
---
mm/khugepaged.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 0c8d30b..c131a90 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -559,6 +559,17 @@ void __khugepaged_exit(struct mm_struct *mm)
static void release_pte_page(struct page *page)
{
mod_node_page_state(page_pgdat(page),
+ NR_ISOLATED_ANON + page_is_file_lru(page), -compound_nr(page));
+ ClearPageActive(page);
+ ClearPageUnevictable(page);
+ unlock_page(page);
+ /* Drop refcount from isolate */
+ put_page(page);
+}
+
+static void release_pte_page_to_lru(struct page *page)
+{
+ mod_node_page_state(page_pgdat(page),
NR_ISOLATED_ANON + page_is_file_lru(page),
-compound_nr(page));
unlock_page(page);
@@ -576,12 +587,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
page = pte_page(pteval);
if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
!PageCompound(page))
- release_pte_page(page);
+ release_pte_page_to_lru(page);
}

list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
list_del(&page->lru);
- release_pte_page(page);
+ release_pte_page_to_lru(page);
}
}

--
1.8.3.1

2020-04-30 00:45:47

by Yang Shi

Subject: Re: [linux-next PATCH 2/2] mm: khugepaged: don't have to put being freed page back to lru



On 4/29/20 3:56 PM, Yang Shi wrote:
> When khugepaged has successfully isolated a base page and copied its data
> to the collapsed THP, the base page is about to be freed. Putting the page
> back on the LRU is not productive: vmscan might isolate it, but vmscan
> cannot reclaim it because try_to_unmap() cannot unmap it at all.
>
> khugepaged is actually the last user of this page, so it can be freed
> directly. Clear the active and unevictable flags, unlock the page, and
> drop the refcount taken at isolation instead of calling
> putback_lru_page().

Please disregard this patch. I just remembered that Kirill added support
for collapsing shared pages. If the pages are shared, they have to be put
back on the LRU since they may still be mapped by other processes. So we
need to check the mapcount if we want to skip the LRU.
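
A minimal userspace sketch of that mapcount check, for illustration only.
struct fake_page, its fields, and release_after_collapse() are hypothetical
stand-ins, not the kernel's struct page or any existing khugepaged
function; the sketch only models the decision described above.

```c
/*
 * Userspace model only, not kernel code. struct fake_page and its
 * fields are hypothetical stand-ins for the real page state.
 */
#include <assert.h>
#include <stdbool.h>

struct fake_page {
	int mapcount;	/* mappings that still point at the page */
	bool on_lru;	/* page was put back on the LRU */
	bool freed;	/* page was released directly */
};

/*
 * If another process still maps the page, it must go back on the LRU.
 * Only a page with no remaining mappings may skip the LRU and be freed.
 */
static void release_after_collapse(struct fake_page *page)
{
	if (page->mapcount > 0)
		page->on_lru = true;
	else
		page->freed = true;
}
```

In this model a page with mapcount == 0 is the only case where skipping
the LRU is safe; any shared page takes the putback path.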

I also spotted another issue. release_pte_page() calls
mod_node_page_state() unconditionally. That was fine before, but with
support for collapsing shared pages we need to check whether the last
mapcount is gone.

Andrew, would you please remove this patch from the -mm tree? I will send
one or two corrected patches. Sorry for the inconvenience.

>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Hugh Dickins <[email protected]>
> Cc: Andrea Arcangeli <[email protected]>
> Signed-off-by: Yang Shi <[email protected]>
> ---
> mm/khugepaged.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 0c8d30b..c131a90 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -559,6 +559,17 @@ void __khugepaged_exit(struct mm_struct *mm)
> static void release_pte_page(struct page *page)
> {
> mod_node_page_state(page_pgdat(page),
> + NR_ISOLATED_ANON + page_is_file_lru(page), -compound_nr(page));
> + ClearPageActive(page);
> + ClearPageUnevictable(page);
> + unlock_page(page);
> + /* Drop refcount from isolate */
> + put_page(page);
> +}
> +
> +static void release_pte_page_to_lru(struct page *page)
> +{
> + mod_node_page_state(page_pgdat(page),
> NR_ISOLATED_ANON + page_is_file_lru(page),
> -compound_nr(page));
> unlock_page(page);
> @@ -576,12 +587,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
> page = pte_page(pteval);
> if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> !PageCompound(page))
> - release_pte_page(page);
> + release_pte_page_to_lru(page);
> }
>
> list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
> list_del(&page->lru);
> - release_pte_page(page);
> + release_pte_page_to_lru(page);
> }
> }
>

2020-04-30 00:51:39

by Yang Shi

Subject: Re: [linux-next PATCH 2/2] mm: khugepaged: don't have to put being freed page back to lru



On 4/29/20 5:41 PM, Yang Shi wrote:
>
>
> On 4/29/20 3:56 PM, Yang Shi wrote:
>> When khugepaged has successfully isolated a base page and copied its data
>> to the collapsed THP, the base page is about to be freed. Putting the page
>> back on the LRU is not productive: vmscan might isolate it, but vmscan
>> cannot reclaim it because try_to_unmap() cannot unmap it at all.
>>
>> khugepaged is actually the last user of this page, so it can be freed
>> directly. Clear the active and unevictable flags, unlock the page, and
>> drop the refcount taken at isolation instead of calling
>> putback_lru_page().
>
> Please disregard this patch. I just remembered that Kirill added support
> for collapsing shared pages. If the pages are shared, they have to be put
> back on the LRU since they may still be mapped by other processes. So we
> need to check the mapcount if we want to skip the LRU.
>
> I also spotted another issue. release_pte_page() calls
> mod_node_page_state() unconditionally. That was fine before, but with
> support for collapsing shared pages we need to check whether the last
> mapcount is gone.

Hmm... this is false. I mixed up NR_ISOLATED_ANON and NR_ANON_MAPPED.
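
To illustrate why: the isolated counters are selected by LRU type only, so
no mapcount check is involved. A minimal userspace sketch, assuming a
made-up two-entry enum (all FAKE_/fake_ names are hypothetical) that
mirrors only the adjacency of the anon and file isolated counters which
the "NR_ISOLATED_ANON + page_is_file_lru(page)" expression in the quoted
hunk relies on:

```c
/* Userspace model only, not kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the two adjacent isolated counters. */
enum fake_node_stat {
	FAKE_NR_ISOLATED_ANON,
	FAKE_NR_ISOLATED_FILE,	/* must directly follow the anon entry */
	FAKE_NR_STATS
};

static long fake_stats[FAKE_NR_STATS];

/*
 * Same shape as the expression in the hunk: the bool is 0 for an anon
 * page and 1 for a file page, so adding it selects the right counter.
 */
static void fake_mod_isolated(bool is_file_lru, long delta)
{
	fake_stats[FAKE_NR_ISOLATED_ANON + is_file_lru] += delta;
}
```

The delta is applied once per isolated page and then undone once on
release, whether or not other processes still map the page.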

>
> Andrew, would you please remove this patch from the -mm tree? I will send
> one or two corrected patches. Sorry for the inconvenience.
>
>>
>> Cc: Kirill A. Shutemov <[email protected]>
>> Cc: Hugh Dickins <[email protected]>
>> Cc: Andrea Arcangeli <[email protected]>
>> Signed-off-by: Yang Shi <[email protected]>
>> ---
>>   mm/khugepaged.c | 15 +++++++++++++--
>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 0c8d30b..c131a90 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -559,6 +559,17 @@ void __khugepaged_exit(struct mm_struct *mm)
>>   static void release_pte_page(struct page *page)
>>   {
>>       mod_node_page_state(page_pgdat(page),
>> +        NR_ISOLATED_ANON + page_is_file_lru(page), -compound_nr(page));
>> +    ClearPageActive(page);
>> +    ClearPageUnevictable(page);
>> +    unlock_page(page);
>> +    /* Drop refcount from isolate */
>> +    put_page(page);
>> +}
>> +
>> +static void release_pte_page_to_lru(struct page *page)
>> +{
>> +    mod_node_page_state(page_pgdat(page),
>>               NR_ISOLATED_ANON + page_is_file_lru(page),
>>               -compound_nr(page));
>>       unlock_page(page);
>> @@ -576,12 +587,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
>>           page = pte_page(pteval);
>>           if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
>>                   !PageCompound(page))
>> -            release_pte_page(page);
>> +            release_pte_page_to_lru(page);
>>       }
>>         list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
>>           list_del(&page->lru);
>> -        release_pte_page(page);
>> +        release_pte_page_to_lru(page);
>>       }
>>   }
>

2020-04-30 22:02:31

by Kirill A. Shutemov

Subject: Re: [linux-next PATCH 1/2] mm: khugepaged: add exceed_max_ptes_* helpers

On Thu, Apr 30, 2020 at 06:56:21AM +0800, Yang Shi wrote:
> The max_ptes_{swap|none|shared} are defined to tune the behavior of
> khugepaged. They are checked in a couple of places with open coding.
> Replace the open coding with exceed_max_ptes_{swap|none|shared} helpers
> to improve readability.
>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Hugh Dickins <[email protected]>
> Cc: Andrea Arcangeli <[email protected]>
> Signed-off-by: Yang Shi <[email protected]>
> ---
> mm/khugepaged.c | 27 +++++++++++++++++++++------
> 1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index a02a4c5..0c8d30b 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -339,6 +339,21 @@ struct attribute_group khugepaged_attr_group = {
> };
> #endif /* CONFIG_SYSFS */
>
> +static inline bool exceed_max_ptes_none(unsigned int *nr_ptes)
> +{
> + return (++(*nr_ptes) > khugepaged_max_ptes_none);
> +}
> +
> +static inline bool exceed_max_ptes_swap(unsigned int *nr_ptes)
> +{
> + return (++(*nr_ptes) > khugepaged_max_ptes_swap);
> +}
> +
> +static inline bool exceed_max_ptes_shared(unsigned int *nr_ptes)
> +{
> + return (++(*nr_ptes) > khugepaged_max_ptes_shared);
> +}
> +

Frankly, I find this ugly and confusing. Open-coded version is more
readable to me.

--
Kirill A. Shutemov

2020-05-01 00:01:57

by Yang Shi

Subject: Re: [linux-next PATCH 1/2] mm: khugepaged: add exceed_max_ptes_* helpers



On 4/30/20 2:59 PM, Kirill A. Shutemov wrote:
> On Thu, Apr 30, 2020 at 06:56:21AM +0800, Yang Shi wrote:
>> The max_ptes_{swap|none|shared} are defined to tune the behavior of
>> khugepaged. They are checked in a couple of places with open coding.
>> Replace the open coding with exceed_max_ptes_{swap|none|shared} helpers
>> to improve readability.
>>
>> Cc: Kirill A. Shutemov <[email protected]>
>> Cc: Hugh Dickins <[email protected]>
>> Cc: Andrea Arcangeli <[email protected]>
>> Signed-off-by: Yang Shi <[email protected]>
>> ---
>> mm/khugepaged.c | 27 +++++++++++++++++++++------
>> 1 file changed, 21 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index a02a4c5..0c8d30b 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -339,6 +339,21 @@ struct attribute_group khugepaged_attr_group = {
>> };
>> #endif /* CONFIG_SYSFS */
>>
>> +static inline bool exceed_max_ptes_none(unsigned int *nr_ptes)
>> +{
>> + return (++(*nr_ptes) > khugepaged_max_ptes_none);
>> +}
>> +
>> +static inline bool exceed_max_ptes_swap(unsigned int *nr_ptes)
>> +{
>> + return (++(*nr_ptes) > khugepaged_max_ptes_swap);
>> +}
>> +
>> +static inline bool exceed_max_ptes_shared(unsigned int *nr_ptes)
>> +{
>> + return (++(*nr_ptes) > khugepaged_max_ptes_shared);
>> +}
>> +
> Frankly, I find this ugly and confusing. Open-coded version is more
> readable to me.

I'm sorry you feel that way. I tend to agree that the dereference does not
look good, and the open-coded version is not hard for me to understand
either.

They are checked in a couple of different places with different
variables (e.g. unmapped vs. swap) and with different comparisons (> vs.
<=). I just thought helpers with a unified name starting with "exceed_"
might make the code more self-explanatory and readable. Anyway, this
comes down to taste and I don't insist on this change.
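
The two comparison styles mentioned above are logical negations of each
other. A minimal userspace sketch, assuming a stand-in limit variable
(max_ptes_swap and its value 64 are arbitrary placeholders for
khugepaged_max_ptes_swap; within_limit_open_coded() and
count_mismatches() are hypothetical names used only for this check):

```c
/* Userspace sketch only, not kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Arbitrary placeholder for khugepaged_max_ptes_swap. */
static unsigned int max_ptes_swap = 64;

/* Helper form from the patch: bump the counter, report if over the limit. */
static inline bool exceed_max_ptes_swap(unsigned int *nr_ptes)
{
	return ++(*nr_ptes) > max_ptes_swap;
}

/* Open-coded "<=" form, as in the khugepaged_scan_pmd() hunk. */
static inline bool within_limit_open_coded(unsigned int *unmapped)
{
	return ++(*unmapped) <= max_ptes_swap;
}

/* Step both forms in lockstep and count the steps where they disagree. */
static unsigned int count_mismatches(unsigned int steps)
{
	unsigned int a = 0, b = 0, mismatches = 0;

	for (unsigned int i = 0; i < steps; i++) {
		bool helper_ok = !exceed_max_ptes_swap(&a);
		bool open_ok = within_limit_open_coded(&b);

		assert(a == b);	/* both forms increment exactly once */
		if (helper_ok != open_ok)
			mismatches++;
	}
	return mismatches;
}
```

Both forms increment the counter as a side effect of the test, which is
the part the pointer-taking helper hides at the call site.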

>

2020-05-01 01:11:31

by Hugh Dickins

Subject: Re: [linux-next PATCH 1/2] mm: khugepaged: add exceed_max_ptes_* helpers

On Thu, 30 Apr 2020, Yang Shi wrote:
> On 4/30/20 2:59 PM, Kirill A. Shutemov wrote:
> > On Thu, Apr 30, 2020 at 06:56:21AM +0800, Yang Shi wrote:
> > > The max_ptes_{swap|none|shared} are defined to tune the behavior of
> > > khugepaged. They are checked in a couple of places with open coding.
> > > Replace the open coding with exceed_max_ptes_{swap|none|shared} helpers
> > > to improve readability.
> > >
> > > Cc: Kirill A. Shutemov <[email protected]>
> > > Cc: Hugh Dickins <[email protected]>
> > > Cc: Andrea Arcangeli <[email protected]>
> > > Signed-off-by: Yang Shi <[email protected]>
> > > ---
> > > mm/khugepaged.c | 27 +++++++++++++++++++++------
> > > 1 file changed, 21 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index a02a4c5..0c8d30b 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -339,6 +339,21 @@ struct attribute_group khugepaged_attr_group = {
> > > };
> > > #endif /* CONFIG_SYSFS */
> > > +static inline bool exceed_max_ptes_none(unsigned int *nr_ptes)
> > > +{
> > > + return (++(*nr_ptes) > khugepaged_max_ptes_none);
> > > +}
> > > +
> > > +static inline bool exceed_max_ptes_swap(unsigned int *nr_ptes)
> > > +{
> > > + return (++(*nr_ptes) > khugepaged_max_ptes_swap);
> > > +}
> > > +
> > > +static inline bool exceed_max_ptes_shared(unsigned int *nr_ptes)
> > > +{
> > > + return (++(*nr_ptes) > khugepaged_max_ptes_shared);
> > > +}
> > > +
> > Frankly, I find this ugly and confusing. Open-coded version is more
> > readable to me.

Wow, yes, I strongly agree with Kirill.

>
> I'm sorry you feel that way. I tend to agree that the dereference does not
> look good, and the open-coded version is not hard for me to understand
> either.
>
> They are checked in a couple of different places with different variables
> (e.g. unmapped vs. swap) and with different comparisons (> vs. <=). I just
> thought helpers with a unified name starting with "exceed_" might make the
> code more self-explanatory and readable. Anyway, this comes down to taste
> and I don't insist on this change.