2021-04-27 13:33:19

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH 0/5] Cleanup and fixup for huge_memory

Hi all,
This series contains cleanups to remove dedicated macro and remove
unnecessary tlb_remove_page_size() for huge zero pmd. Also this adds
missing read-only THP checking for transparent_hugepage_enabled() and
avoids discarding hugepage if other processes are mapping it. More
details can be found in the respective changelogs. Thanks!

Miaohe Lin (5):
mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK
mm/huge_memory.c: use page->deferred_list
mm/huge_memory.c: add missing read-only THP checking in
transparent_hugepage_enabled()
mm/huge_memory.c: remove unnecessary tlb_remove_page_size() for huge
zero pmd
mm/huge_memory.c: don't discard hugepage if other processes are
mapping it

include/linux/huge_mm.h | 6 ++----
mm/huge_memory.c | 12 ++++++------
2 files changed, 8 insertions(+), 10 deletions(-)

--
2.23.0


2021-04-27 13:33:22

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH 2/5] mm/huge_memory.c: use page->deferred_list

Now that we can represent the location of ->deferred_list instead of
->mapping + ->index, make use of it to improve readability.

Signed-off-by: Miaohe Lin <[email protected]>
---
mm/huge_memory.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 63ed6b25deaa..76ca1eb2a223 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2868,7 +2868,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
/* Take pin on all head pages to avoid freeing them under us */
list_for_each_safe(pos, next, &ds_queue->split_queue) {
- page = list_entry((void *)pos, struct page, mapping);
+ page = list_entry((void *)pos, struct page, deferred_list);
page = compound_head(page);
if (get_page_unless_zero(page)) {
list_move(page_deferred_list(page), &list);
@@ -2883,7 +2883,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);

list_for_each_safe(pos, next, &list) {
- page = list_entry((void *)pos, struct page, mapping);
+ page = list_entry((void *)pos, struct page, deferred_list);
if (!trylock_page(page))
goto next;
/* split_huge_page() removes page from list on success */
--
2.23.0

2021-04-27 13:33:34

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH 1/5] mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK

Rewrite the pgoff checking logic to remove macro HPAGE_CACHE_INDEX_MASK
which is only used here to simplify the code.

Signed-off-by: Miaohe Lin <[email protected]>
---
include/linux/huge_mm.h | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 9626fda5efce..0a526f211fec 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -152,15 +152,13 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)

bool transparent_hugepage_enabled(struct vm_area_struct *vma);

-#define HPAGE_CACHE_INDEX_MASK (HPAGE_PMD_NR - 1)
-
static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
unsigned long haddr)
{
/* Don't have to check pgoff for anonymous vma */
if (!vma_is_anonymous(vma)) {
- if (((vma->vm_start >> PAGE_SHIFT) & HPAGE_CACHE_INDEX_MASK) !=
- (vma->vm_pgoff & HPAGE_CACHE_INDEX_MASK))
+ if (!IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
+ HPAGE_PMD_NR))
return false;
}

--
2.23.0

2021-04-27 13:34:31

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH 5/5] mm/huge_memory.c: don't discard hugepage if other processes are mapping it

If other processes are mapping any other subpages of the hugepage, i.e. in
pte-mapped thp case, page_mapcount() will return 1 incorrectly. Then we
would discard the page while other processes are still mapping it. Fix it
by using total_mapcount() which can tell whether other processes are still
mapping it.

Fixes: b8d3c4c3009d ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called")
Signed-off-by: Miaohe Lin <[email protected]>
---
mm/huge_memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f652be6ecca3..d14fecb8cd00 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1604,7 +1604,7 @@ bool madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
* If other processes are mapping this page, we couldn't discard
* the page unless they all do MADV_FREE so let's skip the page.
*/
- if (page_mapcount(page) != 1)
+ if (total_mapcount(page) != 1)
goto out;

if (!trylock_page(page))
--
2.23.0

2021-04-27 13:34:36

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH 3/5] mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()

Since commit 99cb0dbd47a1 ("mm,thp: add read-only THP support for
(non-shmem) FS"), read-only THP file mapping is supported. But it
forgot to add checking for it in transparent_hugepage_enabled().

Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Miaohe Lin <[email protected]>
---
mm/huge_memory.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 76ca1eb2a223..aa22a0ae9894 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -74,6 +74,9 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
return __transparent_hugepage_enabled(vma);
if (vma_is_shmem(vma))
return shmem_huge_enabled(vma);
+ if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
+ (vma->vm_flags & VM_DENYWRITE))
+ return true;

return false;
}
--
2.23.0

2021-04-27 13:34:57

by Miaohe Lin

[permalink] [raw]
Subject: [PATCH 4/5] mm/huge_memory.c: remove unnecessary tlb_remove_page_size() for huge zero pmd

Commit aa88b68c3b1d ("thp: keep huge zero page pinned until tlb flush")
introduced tlb_remove_page() for huge zero page to keep it pinned until
flush is complete and prevents the page from being split under us. But
huge zero page is kept pinned until all relevant mm_users reach zero since
the commit 6fcb52a56ff6 ("thp: reduce usage of huge zero page's atomic
counter"). So tlb_remove_page_size() for huge zero pmd is unnecessary now.

Signed-off-by: Miaohe Lin <[email protected]>
---
mm/huge_memory.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index aa22a0ae9894..f652be6ecca3 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1677,12 +1677,9 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
if (arch_needs_pgtable_deposit())
zap_deposited_table(tlb->mm, pmd);
spin_unlock(ptl);
- if (is_huge_zero_pmd(orig_pmd))
- tlb_remove_page_size(tlb, pmd_page(orig_pmd), HPAGE_PMD_SIZE);
} else if (is_huge_zero_pmd(orig_pmd)) {
zap_deposited_table(tlb->mm, pmd);
spin_unlock(ptl);
- tlb_remove_page_size(tlb, pmd_page(orig_pmd), HPAGE_PMD_SIZE);
} else {
struct page *page = NULL;
int flush_needed = 1;
--
2.23.0

2021-04-27 20:49:15

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH 2/5] mm/huge_memory.c: use page->deferred_list

On Tue, Apr 27, 2021 at 6:32 AM Miaohe Lin <[email protected]> wrote:
>
> Now that we can represent the location of ->deferred_list instead of
> ->mapping + ->index, make use of it to improve readability.

Reviewed-by: Yang Shi <[email protected]>

>
> Signed-off-by: Miaohe Lin <[email protected]>
> ---
> mm/huge_memory.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 63ed6b25deaa..76ca1eb2a223 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2868,7 +2868,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> /* Take pin on all head pages to avoid freeing them under us */
> list_for_each_safe(pos, next, &ds_queue->split_queue) {
> - page = list_entry((void *)pos, struct page, mapping);
> + page = list_entry((void *)pos, struct page, deferred_list);
> page = compound_head(page);
> if (get_page_unless_zero(page)) {
> list_move(page_deferred_list(page), &list);
> @@ -2883,7 +2883,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>
> list_for_each_safe(pos, next, &list) {
> - page = list_entry((void *)pos, struct page, mapping);
> + page = list_entry((void *)pos, struct page, deferred_list);
> if (!trylock_page(page))
> goto next;
> /* split_huge_page() removes page from list on success */
> --
> 2.23.0
>
>

2021-04-27 20:58:25

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH 1/5] mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK

On Tue, Apr 27, 2021 at 6:32 AM Miaohe Lin <[email protected]> wrote:
>
> Rewrite the pgoff checking logic to remove macro HPAGE_CACHE_INDEX_MASK
> which is only used here to simplify the code.

Reviewed-by: Yang Shi <[email protected]>

>
> Signed-off-by: Miaohe Lin <[email protected]>
> ---
> include/linux/huge_mm.h | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 9626fda5efce..0a526f211fec 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -152,15 +152,13 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
>
> bool transparent_hugepage_enabled(struct vm_area_struct *vma);
>
> -#define HPAGE_CACHE_INDEX_MASK (HPAGE_PMD_NR - 1)
> -
> static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
> unsigned long haddr)
> {
> /* Don't have to check pgoff for anonymous vma */
> if (!vma_is_anonymous(vma)) {
> - if (((vma->vm_start >> PAGE_SHIFT) & HPAGE_CACHE_INDEX_MASK) !=
> - (vma->vm_pgoff & HPAGE_CACHE_INDEX_MASK))
> + if (!IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
> + HPAGE_PMD_NR))
> return false;
> }
>
> --
> 2.23.0
>
>

2021-04-27 21:05:17

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH 3/5] mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()

On Tue, Apr 27, 2021 at 6:32 AM Miaohe Lin <[email protected]> wrote:
>
> Since commit 99cb0dbd47a1 ("mm,thp: add read-only THP support for
> (non-shmem) FS"), read-only THP file mapping is supported. But it
> forgot to add checking for it in transparent_hugepage_enabled().
>
> Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
> Signed-off-by: Miaohe Lin <[email protected]>
> ---
> mm/huge_memory.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 76ca1eb2a223..aa22a0ae9894 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -74,6 +74,9 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
> return __transparent_hugepage_enabled(vma);
> if (vma_is_shmem(vma))
> return shmem_huge_enabled(vma);
> + if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
> + (vma->vm_flags & VM_DENYWRITE))
> + return true;

I don't think this change is correct. This function is used to
indicate if allocating THP is eligible for the VMAs or not showed by
smap. And currently readonly FS THP is collapsed by khugepaged only.

So, you need check if the vma is suitable for khugepaged. Take a look
at what hugepage_vma_check() does.

And, the new patch
(https://lore.kernel.org/linux-mm/[email protected]/)
relax the constraints for readonly FS THP, it might be already in -mm
tree, so you need adopt the new condition as well.

>
> return false;
> }

> --
> 2.23.0
>
>

2021-04-27 21:25:44

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH 5/5] mm/huge_memory.c: don't discard hugepage if other processes are mapping it

On Tue, Apr 27, 2021 at 6:32 AM Miaohe Lin <[email protected]> wrote:
>
> If other processes are mapping any other subpages of the hugepage, i.e. in
> pte-mapped thp case, page_mapcount() will return 1 incorrectly. Then we
> would discard the page while other processes are still mapping it. Fix it
> by using total_mapcount() which can tell whether other processes are still
> mapping it.

Seems correct to me. It is possible that the THP is PTE-mapped by the
other processes.

Reviewed-by: Yang Shi <[email protected]>

>
> Fixes: b8d3c4c3009d ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called")
> Signed-off-by: Miaohe Lin <[email protected]>
> ---
> mm/huge_memory.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index f652be6ecca3..d14fecb8cd00 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1604,7 +1604,7 @@ bool madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
> * If other processes are mapping this page, we couldn't discard
> * the page unless they all do MADV_FREE so let's skip the page.
> */
> - if (page_mapcount(page) != 1)
> + if (total_mapcount(page) != 1)
> goto out;
>
> if (!trylock_page(page))
> --
> 2.23.0
>
>

2021-04-28 02:07:53

by Miaohe Lin

[permalink] [raw]
Subject: Re: [PATCH 3/5] mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()

On 2021/4/28 5:03, Yang Shi wrote:
> On Tue, Apr 27, 2021 at 6:32 AM Miaohe Lin <[email protected]> wrote:
>>
>> Since commit 99cb0dbd47a1 ("mm,thp: add read-only THP support for
>> (non-shmem) FS"), read-only THP file mapping is supported. But it
>> forgot to add checking for it in transparent_hugepage_enabled().
>>
>> Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
>> Signed-off-by: Miaohe Lin <[email protected]>
>> ---
>> mm/huge_memory.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 76ca1eb2a223..aa22a0ae9894 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -74,6 +74,9 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
>> return __transparent_hugepage_enabled(vma);
>> if (vma_is_shmem(vma))
>> return shmem_huge_enabled(vma);
>> + if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
>> + (vma->vm_flags & VM_DENYWRITE))
>> + return true;
>

Many thanks for your quick respond and Reviewed-by tag!

> I don't think this change is correct. This function is used to
> indicate if allocating THP is eligible for the VMAs or not showed by
> smap. And currently readonly FS THP is collapsed by khugepaged only.
>
> So, you need check if the vma is suitable for khugepaged. Take a look
> at what hugepage_vma_check() does.
>
> And, the new patch
> (https://lore.kernel.org/linux-mm/[email protected]/)
> relax the constraints for readonly FS THP, it might be already in -mm
> tree, so you need adopt the new condition as well.
>

Many thanks for your comment. I referred to what hugepage_vma_check() does about
Read-only file mappings when I came up this patch. But it seems I am miss something.
Take the new patch into account, the check for READ_ONLY_THP now should be:

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 76ca1eb2a223..a46a558233b4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -74,6 +74,10 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
return __transparent_hugepage_enabled(vma);
if (vma_is_shmem(vma))
return shmem_huge_enabled(vma);
+ if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
+ !inode_is_open_for_write(vma->vm_file->f_inode) &&
+ (vma->vm_flags & VM_EXEC))
+ return true;

return false;
}

Am I miss something about checking for READ_ONLY_THP case? Or READ_ONLY_THP case is ok
but other case is missed? Could you please explain this more detailed for me?

Many thanks!

>>
>> return false;
>> }
>
>> --
>> 2.23.0
>>
>>
>
> .
>

2021-04-28 03:04:12

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH 1/5] mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK


On 4/27/21 7:02 PM, Miaohe Lin wrote:
> Rewrite the pgoff checking logic to remove macro HPAGE_CACHE_INDEX_MASK
> which is only used here to simplify the code.
>
> Signed-off-by: Miaohe Lin <[email protected]>
> ---
> include/linux/huge_mm.h | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 9626fda5efce..0a526f211fec 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -152,15 +152,13 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
>
> bool transparent_hugepage_enabled(struct vm_area_struct *vma);
>
> -#define HPAGE_CACHE_INDEX_MASK (HPAGE_PMD_NR - 1)
> -
> static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
> unsigned long haddr)
> {
> /* Don't have to check pgoff for anonymous vma */
> if (!vma_is_anonymous(vma)) {
> - if (((vma->vm_start >> PAGE_SHIFT) & HPAGE_CACHE_INDEX_MASK) !=
> - (vma->vm_pgoff & HPAGE_CACHE_INDEX_MASK))
> + if (!IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
> + HPAGE_PMD_NR))
> return false;
> }
>
>

LGTM

Reviewed-by: Anshuman Khandual <[email protected]>

2021-04-28 03:07:41

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH 2/5] mm/huge_memory.c: use page->deferred_list


On 4/27/21 7:02 PM, Miaohe Lin wrote:
> Now that we can represent the location of ->deferred_list instead of
> ->mapping + ->index, make use of it to improve readability.

Could you please explain how page->deferred_list and page->mapping
are interchangeable here ?

>
> Signed-off-by: Miaohe Lin <[email protected]>
> ---
> mm/huge_memory.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 63ed6b25deaa..76ca1eb2a223 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2868,7 +2868,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> /* Take pin on all head pages to avoid freeing them under us */
> list_for_each_safe(pos, next, &ds_queue->split_queue) {
> - page = list_entry((void *)pos, struct page, mapping);
> + page = list_entry((void *)pos, struct page, deferred_list);
> page = compound_head(page);
> if (get_page_unless_zero(page)) {
> list_move(page_deferred_list(page), &list);
> @@ -2883,7 +2883,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>
> list_for_each_safe(pos, next, &list) {
> - page = list_entry((void *)pos, struct page, mapping);
> + page = list_entry((void *)pos, struct page, deferred_list);
> if (!trylock_page(page))
> goto next;
> /* split_huge_page() removes page from list on success */
>

2021-04-28 03:11:46

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH 0/5] Cleanup and fixup for huge_memory



On 4/27/21 7:02 PM, Miaohe Lin wrote:
> Hi all,
> This series contains cleanups to remove dedicated macro and remove
> unnecessary tlb_remove_page_size() for huge zero pmd. Also this adds
> missing read-only THP checking for transparent_hugepage_enabled() and
> avoids discarding hugepage if other processes are mapping it. More
> details can be found in the respective changelogs. Thanks!
>
> Miaohe Lin (5):
> mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK
> mm/huge_memory.c: use page->deferred_list
> mm/huge_memory.c: add missing read-only THP checking in
> transparent_hugepage_enabled()
> mm/huge_memory.c: remove unnecessary tlb_remove_page_size() for huge
> zero pmd
> mm/huge_memory.c: don't discard hugepage if other processes are
> mapping it

I guess it might be just better to split the series into cleans-ups
without functional change and then fixes separately.

2021-04-28 08:24:38

by Miaohe Lin

[permalink] [raw]
Subject: Re: [PATCH 2/5] mm/huge_memory.c: use page->deferred_list

On 2021/4/28 11:07, Anshuman Khandual wrote:
>
> On 4/27/21 7:02 PM, Miaohe Lin wrote:
>> Now that we can represent the location of ->deferred_list instead of
>> ->mapping + ->index, make use of it to improve readability.
>
> Could you please explain how page->deferred_list and page->mapping
> are interchangeable here ?

It's because there is a union in struct page:

union {
struct {
struct list_head lru;
struct address_space *mapping;
pgoff_t index;
unsigned long private;
};
struct {
unsigned long _compound_pad_1;
atomic_t hpage_pinned_refcount;
struct list_head *deferred_list*;
};
};

And initially there is no deferred_list and it's added via commit 66a6ffd2af42 ("mm: combine first
three unions in struct page"). Also commit fa3015b7eed5 ("mm: use page->deferred_list") did the similar
cleanup. Am I expected to add these to the commit log?

Many thanks.

>
>>
>> Signed-off-by: Miaohe Lin <[email protected]>
>> ---
>> mm/huge_memory.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 63ed6b25deaa..76ca1eb2a223 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -2868,7 +2868,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>> spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>> /* Take pin on all head pages to avoid freeing them under us */
>> list_for_each_safe(pos, next, &ds_queue->split_queue) {
>> - page = list_entry((void *)pos, struct page, mapping);
>> + page = list_entry((void *)pos, struct page, deferred_list);
>> page = compound_head(page);
>> if (get_page_unless_zero(page)) {
>> list_move(page_deferred_list(page), &list);
>> @@ -2883,7 +2883,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>> spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>>
>> list_for_each_safe(pos, next, &list) {
>> - page = list_entry((void *)pos, struct page, mapping);
>> + page = list_entry((void *)pos, struct page, deferred_list);
>> if (!trylock_page(page))
>> goto next;
>> /* split_huge_page() removes page from list on success */
>>
> .
>

2021-04-28 08:33:49

by Miaohe Lin

[permalink] [raw]
Subject: Re: [PATCH 0/5] Cleanup and fixup for huge_memory

On 2021/4/28 11:10, Anshuman Khandual wrote:
>
>
> On 4/27/21 7:02 PM, Miaohe Lin wrote:
>> Hi all,
>> This series contains cleanups to remove dedicated macro and remove
>> unnecessary tlb_remove_page_size() for huge zero pmd. Also this adds
>> missing read-only THP checking for transparent_hugepage_enabled() and
>> avoids discarding hugepage if other processes are mapping it. More
>> details can be found in the respective changelogs. Thanks!
>>
>> Miaohe Lin (5):
>> mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK
>> mm/huge_memory.c: use page->deferred_list
>> mm/huge_memory.c: add missing read-only THP checking in
>> transparent_hugepage_enabled()
>> mm/huge_memory.c: remove unnecessary tlb_remove_page_size() for huge
>> zero pmd
>> mm/huge_memory.c: don't discard hugepage if other processes are
>> mapping it
>
> I guess it might be just better to split the series into cleans-ups
> without functional change and then fixes separately.

Sounds reasonable. But IMO all of these changes are pretty simple and independent,
maybe it's ok to keep these together?

Many thanks for comment.

> .
>

2021-04-28 17:41:12

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH 3/5] mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()

On Tue, Apr 27, 2021 at 7:06 PM Miaohe Lin <[email protected]> wrote:
>
> On 2021/4/28 5:03, Yang Shi wrote:
> > On Tue, Apr 27, 2021 at 6:32 AM Miaohe Lin <[email protected]> wrote:
> >>
> >> Since commit 99cb0dbd47a1 ("mm,thp: add read-only THP support for
> >> (non-shmem) FS"), read-only THP file mapping is supported. But it
> >> forgot to add checking for it in transparent_hugepage_enabled().
> >>
> >> Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
> >> Signed-off-by: Miaohe Lin <[email protected]>
> >> ---
> >> mm/huge_memory.c | 3 +++
> >> 1 file changed, 3 insertions(+)
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 76ca1eb2a223..aa22a0ae9894 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >> @@ -74,6 +74,9 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
> >> return __transparent_hugepage_enabled(vma);
> >> if (vma_is_shmem(vma))
> >> return shmem_huge_enabled(vma);
> >> + if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
> >> + (vma->vm_flags & VM_DENYWRITE))
> >> + return true;
> >
>
> Many thanks for your quick respond and Reviewed-by tag!
>
> > I don't think this change is correct. This function is used to
> > indicate if allocating THP is eligible for the VMAs or not showed by
> > smap. And currently readonly FS THP is collapsed by khugepaged only.
> >
> > So, you need check if the vma is suitable for khugepaged. Take a look
> > at what hugepage_vma_check() does.
> >
> > And, the new patch
> > (https://lore.kernel.org/linux-mm/[email protected]/)
> > relax the constraints for readonly FS THP, it might be already in -mm
> > tree, so you need adopt the new condition as well.
> >
>
> Many thanks for your comment. I referred to what hugepage_vma_check() does about
> Read-only file mappings when I came up this patch. But it seems I am miss something.

Yes, you need do the below check for readonly FS THP too:

if ((vm_flags & VM_NOHUGEPAGE) ||
test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
return false;

This check is done separately for anonymous and shmem. It seems not
perfect to do it three times in a row. So I'd suggest extracting the
check into a common helper then call it at the top of
transparent_hugepage_enabled() .

The helper also could replace the same check in
__transparent_hugepage_enabled() and shmem_huge_enabled().

> Take the new patch into account, the check for READ_ONLY_THP now should be:
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 76ca1eb2a223..a46a558233b4 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -74,6 +74,10 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
> return __transparent_hugepage_enabled(vma);
> if (vma_is_shmem(vma))
> return shmem_huge_enabled(vma);
> + if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
> + !inode_is_open_for_write(vma->vm_file->f_inode) &&
> + (vma->vm_flags & VM_EXEC))
> + return true;
>
> return false;
> }
>
> Am I miss something about checking for READ_ONLY_THP case? Or READ_ONLY_THP case is ok
> but other case is missed? Could you please explain this more detailed for me?
>
> Many thanks!
>
> >>
> >> return false;
> >> }
> >
> >> --
> >> 2.23.0
> >>
> >>
> >
> > .
> >
>

2021-04-29 02:01:55

by Miaohe Lin

[permalink] [raw]
Subject: Re: [PATCH 3/5] mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()

On 2021/4/29 0:21, Yang Shi wrote:
> On Tue, Apr 27, 2021 at 7:06 PM Miaohe Lin <[email protected]> wrote:
>>
>> On 2021/4/28 5:03, Yang Shi wrote:
>>> On Tue, Apr 27, 2021 at 6:32 AM Miaohe Lin <[email protected]> wrote:
>>>>
>>>> Since commit 99cb0dbd47a1 ("mm,thp: add read-only THP support for
>>>> (non-shmem) FS"), read-only THP file mapping is supported. But it
>>>> forgot to add checking for it in transparent_hugepage_enabled().
>>>>
>>>> Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
>>>> Signed-off-by: Miaohe Lin <[email protected]>
>>>> ---
>>>> mm/huge_memory.c | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index 76ca1eb2a223..aa22a0ae9894 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -74,6 +74,9 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
>>>> return __transparent_hugepage_enabled(vma);
>>>> if (vma_is_shmem(vma))
>>>> return shmem_huge_enabled(vma);
>>>> + if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
>>>> + (vma->vm_flags & VM_DENYWRITE))
>>>> + return true;
>>>
>>
>> Many thanks for your quick respond and Reviewed-by tag!
>>
>>> I don't think this change is correct. This function is used to
>>> indicate if allocating THP is eligible for the VMAs or not showed by
>>> smap. And currently readonly FS THP is collapsed by khugepaged only.
>>>
>>> So, you need check if the vma is suitable for khugepaged. Take a look
>>> at what hugepage_vma_check() does.
>>>
>>> And, the new patch
>>> (https://lore.kernel.org/linux-mm/[email protected]/)
>>> relax the constraints for readonly FS THP, it might be already in -mm
>>> tree, so you need adopt the new condition as well.
>>>
>>
>> Many thanks for your comment. I referred to what hugepage_vma_check() does about
>> Read-only file mappings when I came up this patch. But it seems I am miss something.
>
> Yes, you need do the below check for readonly FS THP too:
>
> if ((vm_flags & VM_NOHUGEPAGE) ||
> test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
> return false;
>
> This check is done separately for anonymous and shmem. It seems not
> perfect to do it three times in a row. So I'd suggest extracting the
> check into a common helper then call it at the top of
> transparent_hugepage_enabled() .
>
> The helper also could replace the same check in
> __transparent_hugepage_enabled() and shmem_huge_enabled().
>

I see. Many thanks for detailed explanation and good suggestion! Will do it in v2. :)

>> Take the new patch into account, the check for READ_ONLY_THP now should be:
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 76ca1eb2a223..a46a558233b4 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -74,6 +74,10 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
>> return __transparent_hugepage_enabled(vma);
>> if (vma_is_shmem(vma))
>> return shmem_huge_enabled(vma);
>> + if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
>> + !inode_is_open_for_write(vma->vm_file->f_inode) &&
>> + (vma->vm_flags & VM_EXEC))
>> + return true;
>>
>> return false;
>> }
>>
>> Am I miss something about checking for READ_ONLY_THP case? Or READ_ONLY_THP case is ok
>> but other case is missed? Could you please explain this more detailed for me?
>>
>> Many thanks!
>>
>>>>
>>>> return false;
>>>> }
>>>
>>>> --
>>>> 2.23.0
>>>>
>>>>
>>>
>>> .
>>>
>>
> .
>