2019-06-13 16:49:21

by Yang Shi

[permalink] [raw]
Subject: [v3 PATCH 2/2] mm: thp: fix false negative of shmem vma's THP eligibility

The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
vma") introduced THPeligible bit for processes' smaps. But, when checking
the eligibility for shmem vma, __transparent_hugepage_enabled() is
called to override the result from shmem_huge_enabled(). It may result
in the anonymous vma's THP flag override shmem's. For example, running a
simple test which create THP for shmem, but with anonymous THP disabled,
when reading the process's smaps, it may show:

7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
Size: 4096 kB
...
[snip]
...
ShmemPmdMapped: 4096 kB
...
[snip]
...
THPeligible: 0

And, /proc/meminfo does show THP allocated and PMD mapped too:

ShmemHugePages: 4096 kB
ShmemPmdMapped: 4096 kB

This doesn't make too much sense. The shmem objects should be treated
separately from anonymous THP. Calling shmem_huge_enabled() with checking
MMF_DISABLE_THP sounds good enough. And, we could skip stack and
dax vma check since we already checked if the vma is shmem already.

Also check if vma is suitable for THP by calling
transhuge_vma_suitable().

And minor fix to smaps output format and documentation.

Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")
Cc: Hugh Dickins <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: David Rientjes <[email protected]>
Signed-off-by: Yang Shi <[email protected]>
---
Documentation/filesystems/proc.txt | 4 ++--
fs/proc/task_mmu.c | 3 ++-
mm/huge_memory.c | 9 +++++++--
mm/shmem.c | 3 +++
4 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 66cad5c..b0ded06 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -477,8 +477,8 @@ replaced by copy-on-write) part of the underlying shmem object out on swap.
"SwapPss" shows proportional swap share of this mapping. Unlike "Swap", this
does not take into account swapped out page of underlying shmem objects.
"Locked" indicates whether the mapping is locked in memory or not.
-"THPeligible" indicates whether the mapping is eligible for THP pages - 1 if
-true, 0 otherwise.
+"THPeligible" indicates whether the mapping is eligible for allocating THP
+pages - 1 if true, 0 otherwise. It just shows the current status.

"VmFlags" field deserves a separate description. This member represents the kernel
flags associated with the particular virtual memory area in two letter encoded
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 01d4eb0..6a13882 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -796,7 +796,8 @@ static int show_smap(struct seq_file *m, void *v)

__show_smap(m, &mss);

- seq_printf(m, "THPeligible: %d\n", transparent_hugepage_enabled(vma));
+ seq_printf(m, "THPeligible: %d\n",
+ transparent_hugepage_enabled(vma));

if (arch_pkeys_enabled())
seq_printf(m, "ProtectionKey: %8u\n", vma_pkey(vma));
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 4bc2552..36f0225 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -65,10 +65,15 @@

bool transparent_hugepage_enabled(struct vm_area_struct *vma)
{
+ /* The addr is used to check if the vma size fits */
+ unsigned long addr = (vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE;
+
+ if (!transhuge_vma_suitable(vma, addr))
+ return false;
if (vma_is_anonymous(vma))
return __transparent_hugepage_enabled(vma);
- if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
- return __transparent_hugepage_enabled(vma);
+ if (vma_is_shmem(vma))
+ return shmem_huge_enabled(vma);

return false;
}
diff --git a/mm/shmem.c b/mm/shmem.c
index 1bb3b8d..a807712 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3872,6 +3872,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
loff_t i_size;
pgoff_t off;

+ if ((vma->vm_flags & VM_NOHUGEPAGE) ||
+ test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
+ return false;
if (shmem_huge == SHMEM_HUGE_FORCE)
return true;
if (shmem_huge == SHMEM_HUGE_DENY)
--
1.8.3.1


2019-06-19 12:15:14

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [v3 PATCH 2/2] mm: thp: fix false negative of shmem vma's THP eligibility

On 6/13/19 6:44 AM, Yang Shi wrote:
> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
> vma") introduced THPeligible bit for processes' smaps. But, when checking
> the eligibility for shmem vma, __transparent_hugepage_enabled() is
> called to override the result from shmem_huge_enabled(). It may result
> in the anonymous vma's THP flag override shmem's. For example, running a
> simple test which create THP for shmem, but with anonymous THP disabled,
> when reading the process's smaps, it may show:

...

> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 01d4eb0..6a13882 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -796,7 +796,8 @@ static int show_smap(struct seq_file *m, void *v)
>
> __show_smap(m, &mss);
>
> - seq_printf(m, "THPeligible: %d\n", transparent_hugepage_enabled(vma));
> + seq_printf(m, "THPeligible: %d\n",
> + transparent_hugepage_enabled(vma));
>
> if (arch_pkeys_enabled())
> seq_printf(m, "ProtectionKey: %8u\n", vma_pkey(vma));
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 4bc2552..36f0225 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -65,10 +65,15 @@
>
> bool transparent_hugepage_enabled(struct vm_area_struct *vma)
> {
> + /* The addr is used to check if the vma size fits */
> + unsigned long addr = (vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE;
> +
> + if (!transhuge_vma_suitable(vma, addr))
> + return false;

Sorry for replying rather late, and not in the v2 thread, but unlike
Hugh I'm not convinced that we should include vma size/alignment in the
test for reporting THPeligible, which was supposed to reflect
administrative settings and madvise hints. I guess it's mostly a matter
of personal feeling. But one objective distinction is that the admin
settings and madvise do have an exact binary result for the whole VMA,
while this check is more fuzzy - only part of the VMA's span might be
properly sized+aligned, and THPeligible will be 1 for the whole VMA.

> if (vma_is_anonymous(vma))
> return __transparent_hugepage_enabled(vma);
> - if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
> - return __transparent_hugepage_enabled(vma);
> + if (vma_is_shmem(vma))
> + return shmem_huge_enabled(vma);
>
> return false;
> }
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 1bb3b8d..a807712 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3872,6 +3872,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
> loff_t i_size;
> pgoff_t off;
>
> + if ((vma->vm_flags & VM_NOHUGEPAGE) ||
> + test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
> + return false;
> if (shmem_huge == SHMEM_HUGE_FORCE)
> return true;
> if (shmem_huge == SHMEM_HUGE_DENY)
>

2019-06-19 16:30:01

by Yang Shi

[permalink] [raw]
Subject: Re: [v3 PATCH 2/2] mm: thp: fix false negative of shmem vma's THP eligibility



On 6/19/19 5:12 AM, Vlastimil Babka wrote:
> On 6/13/19 6:44 AM, Yang Shi wrote:
>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
>> vma") introduced THPeligible bit for processes' smaps. But, when checking
>> the eligibility for shmem vma, __transparent_hugepage_enabled() is
>> called to override the result from shmem_huge_enabled(). It may result
>> in the anonymous vma's THP flag override shmem's. For example, running a
>> simple test which create THP for shmem, but with anonymous THP disabled,
>> when reading the process's smaps, it may show:
> ...
>
>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> index 01d4eb0..6a13882 100644
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
>> @@ -796,7 +796,8 @@ static int show_smap(struct seq_file *m, void *v)
>>
>> __show_smap(m, &mss);
>>
>> - seq_printf(m, "THPeligible: %d\n", transparent_hugepage_enabled(vma));
>> + seq_printf(m, "THPeligible: %d\n",
>> + transparent_hugepage_enabled(vma));
>>
>> if (arch_pkeys_enabled())
>> seq_printf(m, "ProtectionKey: %8u\n", vma_pkey(vma));
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 4bc2552..36f0225 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -65,10 +65,15 @@
>>
>> bool transparent_hugepage_enabled(struct vm_area_struct *vma)
>> {
>> + /* The addr is used to check if the vma size fits */
>> + unsigned long addr = (vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE;
>> +
>> + if (!transhuge_vma_suitable(vma, addr))
>> + return false;
> Sorry for replying rather late, and not in the v2 thread, but unlike
> Hugh I'm not convinced that we should include vma size/alignment in the
> test for reporting THPeligible, which was supposed to reflect
> administrative settings and madvise hints. I guess it's mostly a matter
> of personal feeling. But one objective distinction is that the admin
> settings and madvise do have an exact binary result for the whole VMA,
> while this check is more fuzzy - only part of the VMA's span might be
> properly sized+aligned, and THPeligible will be 1 for the whole VMA.

I think THPeligible is used to tell us if the vma is suitable for
allocating THP. Both anonymous and shmem THP checks vma size/alignment
to decide to or not to allocate THP.

And, if vma size/alignment is not checked, THPeligible may show "true"
for even 4K mapping. This doesn't make too much sense either.

>
>> if (vma_is_anonymous(vma))
>> return __transparent_hugepage_enabled(vma);
>> - if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
>> - return __transparent_hugepage_enabled(vma);
>> + if (vma_is_shmem(vma))
>> + return shmem_huge_enabled(vma);
>>
>> return false;
>> }
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index 1bb3b8d..a807712 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -3872,6 +3872,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
>> loff_t i_size;
>> pgoff_t off;
>>
>> + if ((vma->vm_flags & VM_NOHUGEPAGE) ||
>> + test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
>> + return false;
>> if (shmem_huge == SHMEM_HUGE_FORCE)
>> return true;
>> if (shmem_huge == SHMEM_HUGE_DENY)
>>

2019-07-17 19:46:28

by Hugh Dickins

[permalink] [raw]
Subject: Re: [v3 PATCH 2/2] mm: thp: fix false negative of shmem vma's THP eligibility

On Thu, 13 Jun 2019, Yang Shi wrote:

> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
> vma") introduced THPeligible bit for processes' smaps. But, when checking
> the eligibility for shmem vma, __transparent_hugepage_enabled() is
> called to override the result from shmem_huge_enabled(). It may result
> in the anonymous vma's THP flag override shmem's. For example, running a
> simple test which create THP for shmem, but with anonymous THP disabled,
> when reading the process's smaps, it may show:
>
> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
> Size: 4096 kB
> ...
> [snip]
> ...
> ShmemPmdMapped: 4096 kB
> ...
> [snip]
> ...
> THPeligible: 0
>
> And, /proc/meminfo does show THP allocated and PMD mapped too:
>
> ShmemHugePages: 4096 kB
> ShmemPmdMapped: 4096 kB
>
> This doesn't make too much sense. The shmem objects should be treated
> separately from anonymous THP. Calling shmem_huge_enabled() with checking
> MMF_DISABLE_THP sounds good enough. And, we could skip stack and
> dax vma check since we already checked if the vma is shmem already.
>
> Also check if vma is suitable for THP by calling
> transhuge_vma_suitable().
>
> And minor fix to smaps output format and documentation.
>
> Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")
> Cc: Hugh Dickins <[email protected]>

Thanks,
Acked-by: Hugh Dickins <[email protected]>

> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Vlastimil Babka <[email protected]>
> Cc: David Rientjes <[email protected]>
> Signed-off-by: Yang Shi <[email protected]>
> ---
> Documentation/filesystems/proc.txt | 4 ++--
> fs/proc/task_mmu.c | 3 ++-
> mm/huge_memory.c | 9 +++++++--
> mm/shmem.c | 3 +++
> 4 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
> index 66cad5c..b0ded06 100644
> --- a/Documentation/filesystems/proc.txt
> +++ b/Documentation/filesystems/proc.txt
> @@ -477,8 +477,8 @@ replaced by copy-on-write) part of the underlying shmem object out on swap.
> "SwapPss" shows proportional swap share of this mapping. Unlike "Swap", this
> does not take into account swapped out page of underlying shmem objects.
> "Locked" indicates whether the mapping is locked in memory or not.
> -"THPeligible" indicates whether the mapping is eligible for THP pages - 1 if
> -true, 0 otherwise.
> +"THPeligible" indicates whether the mapping is eligible for allocating THP
> +pages - 1 if true, 0 otherwise. It just shows the current status.
>
> "VmFlags" field deserves a separate description. This member represents the kernel
> flags associated with the particular virtual memory area in two letter encoded
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 01d4eb0..6a13882 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -796,7 +796,8 @@ static int show_smap(struct seq_file *m, void *v)
>
> __show_smap(m, &mss);
>
> - seq_printf(m, "THPeligible: %d\n", transparent_hugepage_enabled(vma));
> + seq_printf(m, "THPeligible: %d\n",
> + transparent_hugepage_enabled(vma));
>
> if (arch_pkeys_enabled())
> seq_printf(m, "ProtectionKey: %8u\n", vma_pkey(vma));
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 4bc2552..36f0225 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -65,10 +65,15 @@
>
> bool transparent_hugepage_enabled(struct vm_area_struct *vma)
> {
> + /* The addr is used to check if the vma size fits */
> + unsigned long addr = (vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE;
> +
> + if (!transhuge_vma_suitable(vma, addr))
> + return false;
> if (vma_is_anonymous(vma))
> return __transparent_hugepage_enabled(vma);
> - if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
> - return __transparent_hugepage_enabled(vma);
> + if (vma_is_shmem(vma))
> + return shmem_huge_enabled(vma);
>
> return false;
> }
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 1bb3b8d..a807712 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3872,6 +3872,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
> loff_t i_size;
> pgoff_t off;
>
> + if ((vma->vm_flags & VM_NOHUGEPAGE) ||
> + test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
> + return false;
> if (shmem_huge == SHMEM_HUGE_FORCE)
> return true;
> if (shmem_huge == SHMEM_HUGE_DENY)
> --
> 1.8.3.1

2019-07-18 21:45:58

by Andrew Morton

[permalink] [raw]
Subject: Re: [v3 PATCH 2/2] mm: thp: fix false negative of shmem vma's THP eligibility

On Wed, 19 Jun 2019 09:28:42 -0700 Yang Shi <[email protected]> wrote:

> > Sorry for replying rather late, and not in the v2 thread, but unlike
> > Hugh I'm not convinced that we should include vma size/alignment in the
> > test for reporting THPeligible, which was supposed to reflect
> > administrative settings and madvise hints. I guess it's mostly a matter
> > of personal feeling. But one objective distinction is that the admin
> > settings and madvise do have an exact binary result for the whole VMA,
> > while this check is more fuzzy - only part of the VMA's span might be
> > properly sized+aligned, and THPeligible will be 1 for the whole VMA.
>
> I think THPeligible is used to tell us if the vma is suitable for
> allocating THP. Both anonymous and shmem THP checks vma size/alignment
> to decide to or not to allocate THP.
>
> And, if vma size/alignment is not checked, THPeligible may show "true"
> for even 4K mapping. This doesn't make too much sense either.

This discussion seems rather inconclusive. I'll merge up the patchset
anyway. Vlastimil, if you think some changes are needed here then
please let's get them sorted out over the next few weeks?

2019-07-18 21:54:03

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [v3 PATCH 2/2] mm: thp: fix false negative of shmem vma's THP eligibility

On 7/18/19 11:44 PM, Andrew Morton wrote:
> On Wed, 19 Jun 2019 09:28:42 -0700 Yang Shi <[email protected]> wrote:
>
>>> Sorry for replying rather late, and not in the v2 thread, but unlike
>>> Hugh I'm not convinced that we should include vma size/alignment in the
>>> test for reporting THPeligible, which was supposed to reflect
>>> administrative settings and madvise hints. I guess it's mostly a matter
>>> of personal feeling. But one objective distinction is that the admin
>>> settings and madvise do have an exact binary result for the whole VMA,
>>> while this check is more fuzzy - only part of the VMA's span might be
>>> properly sized+aligned, and THPeligible will be 1 for the whole VMA.
>>
>> I think THPeligible is used to tell us if the vma is suitable for
>> allocating THP. Both anonymous and shmem THP checks vma size/alignment
>> to decide to or not to allocate THP.
>>
>> And, if vma size/alignment is not checked, THPeligible may show "true"
>> for even 4K mapping. This doesn't make too much sense either.
>
> This discussion seems rather inconclusive. I'll merge up the patchset
> anyway. Vlastimil, if you think some changes are needed here then
> please let's get them sorted out over the next few weeks?

Well, Hugh did ack it, albeit without commenting on this part. I don't
feel strongly enough about this for a nack.

2019-07-18 22:09:27

by Hugh Dickins

[permalink] [raw]
Subject: Re: [v3 PATCH 2/2] mm: thp: fix false negative of shmem vma's THP eligibility

On Thu, 18 Jul 2019, Vlastimil Babka wrote:
> On 7/18/19 11:44 PM, Andrew Morton wrote:
> > On Wed, 19 Jun 2019 09:28:42 -0700 Yang Shi <[email protected]> wrote:
> >
> >>> Sorry for replying rather late, and not in the v2 thread, but unlike
> >>> Hugh I'm not convinced that we should include vma size/alignment in the
> >>> test for reporting THPeligible, which was supposed to reflect
> >>> administrative settings and madvise hints. I guess it's mostly a matter
> >>> of personal feeling. But one objective distinction is that the admin
> >>> settings and madvise do have an exact binary result for the whole VMA,
> >>> while this check is more fuzzy - only part of the VMA's span might be
> >>> properly sized+aligned, and THPeligible will be 1 for the whole VMA.
> >>
> >> I think THPeligible is used to tell us if the vma is suitable for
> >> allocating THP. Both anonymous and shmem THP checks vma size/alignment
> >> to decide to or not to allocate THP.
> >>
> >> And, if vma size/alignment is not checked, THPeligible may show "true"
> >> for even 4K mapping. This doesn't make too much sense either.
> >
> > This discussion seems rather inconclusive. I'll merge up the patchset
> > anyway. Vlastimil, if you think some changes are needed here then
> > please let's get them sorted out over the next few weeks?
>
> Well, Hugh did ack it, albeit without commenting on this part. I don't
> feel strongly enough about this for a nack.

Right, I had no further comment: Yang and I agreed one way round,
you thought the other way. I was more persuaded by Yang's 4kB
example than by what you wrote; but not hugely excited either way.

Hugh